Award Number: DAMD17-00-1-0417


TITLE: Environmental Exposures at Birth and at Menarche and Risk of Breast Cancer


PRINCIPAL  INVESTIGATOR:  Jo  L.  Freudenheim,  Ph.D. 


CONTRACTING ORGANIZATION: New York State University Research Foundation

Amherst,  New  York  14228-2567 


REPORT  DATE:  June  2004 


TYPE  OF  REPORT:  Final 


PREPARED  FOR:  U.S.  Army  Medical  Research  and  Materiel  Command 
Fort  Detrick,  Maryland  21702-5012 


DISTRIBUTION STATEMENT: Approved for Public Release; Distribution Unlimited


The  views,  opinions  and/or  findings  contained  in  this  report  are 
those of the author(s) and should not be construed as an official
Department  of  the  Army  position,  policy  or  decision  unless  so 
designated  by  other  documentation. 


20041123  077 




REPORT  DOCUMENTATION  PAGE 


Form  Approved 
OMB No. 074-0188


Public reporting burden for this collection of information is estimated to average 1 hour per response, including the time for reviewing instructions, searching existing data sources, gathering and maintaining
the data needed, and completing and reviewing this collection of information. Send comments regarding this burden estimate or any other aspect of this collection of information, including suggestions for
reducing this burden to Washington Headquarters Services, Directorate for Information Operations and Reports, 1215 Jefferson Davis Highway, Suite 1204, Arlington, VA 22202-4302, and to the Office of
Management and Budget, Paperwork Reduction Project (0704-0188), Washington, DC 20503


1. AGENCY USE ONLY (Leave blank)
2. REPORT DATE: June 2004
3. REPORT TYPE AND DATES COVERED: Final (1 Jun 00 - 31 May 04)


4.  TITLE  AND  SUBTITLE 

Environmental  Exposures  at  Birth  and  at  Menarche  and  Risk 
of  Breast  Cancer 


5.  FUNDING  NUMBERS 

DAMD17-00-1-0417 


6.  AUTHOR(S) 

Jo  L.  Freudenheim,  Ph.D. 


7. PERFORMING ORGANIZATION NAME(S) AND ADDRESS(ES)

New  York  State  University  Research  Foundation 
Amherst,  New  York  14228-2567 


8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 


E-Mail: jfreuden@buffalo.edu


9.  SPONSORING  /  MONITORING 

AGENCY NAME(S) AND ADDRESS(ES)

U.S.  Army  Medical  Research  and  Materiel  Command 
Fort  Detrick,  Maryland  21702-5012 


10.  SPONSORING  /  MONITORING 
AGENCY  REPORT  NUMBER 


12a.  DISTRIBUTION  /  AVAILABILITY  STATEMENT 

Approved  for  Public  Release;  Distribution  Unlimited 


12b.  DISTRIBUTION  CODE 


13.  ABSTRACT  (Maximum  200  Words) 

This  population-based  study  examines  early  life  exposure  to  environmental  pollutants  from  industrial  sites,  toxic  waste 
sites and heavily trafficked roadways as risk factors for breast cancer, with a focus on exposure to benzene and
polycyclic aromatic hydrocarbons (PAHs). We have geocoded 15,340 individual addresses for 3,286 participants in Erie and
Niagara  counties  in  New  York  State.  A  validation  study  assessed  the  positional  accuracy  of  addresses  geocoded  on  the 
Dynamap2000  using  a  global  positioning  system  receiver.  Overall,  geocoding  was  accurate.  Analyses  have  been 
completed  examining  residential  proximity  to  industrial  sites  contracting  with  the  US  Atomic  Energy  Commission 
(USAEC),  for  exposure  to  total  suspended  particulates  (TSP),  and  exposure  to  environmental  tobacco  smoke  (ETS)  and 
breast  cancer  risk.  Proximity  to  sites  contracted  by  USAEC  was  not  associated  with  risk.  Exposure  to  TSP  in  early  life 
was  associated  with  a  2.75-fold  increase  in  risk  for  postmenopausal  women  only.  There  was  little  evidence  of  an 
association  between  early  life  exposure  to  ETS  and  breast  cancer.  Clustering  analyses  identified  geographic  patterns  of 
residence  for  breast  cancer  cases  and  controls  at  critical  time  periods  in  early  life.  These  results  provide  evidence  that 
environmental  exposures  in  early  life  may  be  important  for  breast  cancer  risk. 


14.  SUBJECT  TERMS 

Cancer etiology, carcinogenesis, environment, geography, molecular
epidemiology, gene-environment interaction, early childhood
exposures

15. NUMBER OF PAGES: 208

16. PRICE CODE

17. SECURITY CLASSIFICATION OF REPORT: Unclassified

18. SECURITY CLASSIFICATION OF THIS PAGE: Unclassified

19. SECURITY CLASSIFICATION OF ABSTRACT: Unclassified

20. LIMITATION OF ABSTRACT: Unlimited

NSN  7540-01-280-5500 


Standard  Form  298  (Rev.  2-89) 

Prescribed  by  ANSI  Std.  Z39-18 
298-102 


Table  of  Contents 


Cover

SF 298

Table of Contents

Introduction

Body

Conclusions

Project Personnel

Key Research Accomplishments

Reportable Outcomes

Appendices


INTRODUCTION 


In  this  population-based  study  we  examined  environmental  exposures  experienced 
at  birth  and  at  menarche  as  risk  factors  for  breast  cancer.  We  examined  location  of 
residence  during  these  potentially  sensitive  time  periods  in  relation  to  proximity  to 
industrial  sites,  toxic  waste  sites,  and  heavily  trafficked  roadways  as  risk  factors  for 
subsequent  disease.  Residential  histories  were  obtained  from  all  participants  in  our 
previously  conducted  case-control  study,  which  included  women  age  35-79  with  incident, 
primary,  histologically  confirmed  breast  cancer  and  living  in  Erie  or  Niagara  counties. 
Randomly  selected  controls  were  frequency  matched  to  cases  on  age,  race,  and  county  of 
residence.  Residence  at  the  time  of  birth  and  menarche  and  potential  exposure  sites  were 
geocoded into GIS. Primary objectives of this study are: 1) To investigate distance from
steel  mills,  chemical  factories,  toxic  waste  sites,  other  industrial  sites  and  major  roadways 
of  the  residence  of  cases  and  controls  at  the  time  of  birth  and  at  menarche  as  risk  factors 
for pre- and postmenopausal breast cancer. 2) To examine estimated exposure to benzene
and to PAHs as risk factors for pre- and postmenopausal breast cancer. 3) To evaluate
genetic susceptibility in relation to these exposures and breast cancer risk by examining
genetic variability in metabolism by NQO1, GSTM1, GSTP1, and CYP1A1. We
assessed  potential  confounding  factors  including:  age,  education,  income,  family  history 
of  breast  cancer,  Quetelet  index,  body  fat  distribution,  having  been  breastfed,  age  at 
menarche,  age  at  menopause,  pregnancy  history,  lactation  and  contraceptive  history, 
menstrual  cycle  length,  birth  weight,  smoking  and  passive  smoke  exposure  history,  and 
diet  and  occupational  history.  In  addition  to  our  original  aims  we  also  examined  1) 
clustering  of  cases  compared  to  controls  for  residence  at  birth  and  at  menarche;  2)  risk  of 
breast cancer associated with measured total suspended particulates in the air for birth
and menarche residence; and 3) risk of breast cancer associated with passive smoke
exposure  in  childhood  (another  source  of  PAH  exposure).  Results  of  this  study  are 
discussed  in  the  text  of  this  report. 

BODY OF REPORT


Task 1: Investigate distance from steel mills, chemical factories, gasoline stations,
toxic waste sites, and other industrial sites of the residence of cases and controls at
the time of birth and at menarche as risk factors for pre- and postmenopausal
breast cancer.


Task  1  is  completed.  We  have  identified  and  completed  data  entry  for  relevant  industrial 
sites  and  major  roadways  during  time  periods  under  investigation.  We  have  identified 
additional  sources  of  information  regarding  historical  sources  of  the  exposures  of  interest 
and  their  locations  and  amounts.  We  have  verified  and  geocoded  residential  histories  of 
study  participants.  We  have  completed  the  geocoding  for  study  participants  for  their 
residence at the time of their birth, at menarche, when they had a first birth, and 10 and 20
years  before  diagnosis  (cases)  or  interview  (controls).  In  total,  we  have  geocoded 
approximately  20,000  addresses  in  Erie  and  Niagara  counties.  In  addition,  we  have 
conducted  a  validation  study  of  the  positional  accuracy  of  geocoded  residences.  Results 
of  this  validation  study  will  be  published  in  the  journal  Epidemiology  in  July  2003  (see 
APPENDIX  I).  We  have  also  completed  data  analysis  examining  early  life  proximity  to 
industrial  sites  contracted  by  the  United  States  Atomic  Energy  Commission  in  relation  to 
risk  of  breast  cancer  in  adult  life.  Currently  a  manuscript  for  these  analyses  is  in 
preparation  and  will  be  submitted  for  publication  in  the  next  year. 

Two  abstracts  from  the  work  on  this  task  were  presented  at  the  annual  meeting 
of the Society for Epidemiologic Research in Atlanta, Georgia, June 11-14, 2003, and the
abstracts  were  published  in  a  supplement  of  the  American  Journal  of  Epidemiology.  They 
are:  "Residential  Proximity  to  Chemical  or  Primary  Metal  Industry  and  the  Risk  of  Breast 
Cancer  in  Western  New  York,"  and  “Clustering  of  Lifetime  Residence  and  Breast  Cancer 
Risk  in  Western  New  York.”  Copies  of  the  abstracts  are  in  APPENDIX  V. 

We completed GIS-based spatial and temporal analyses for residences of breast
cancer cases and controls in early life. Since we found strong evidence of spatial
clustering  for  cases  at  early  time  periods,  we  will  continue  with  the  next  step  which  is  to 
estimate  breast  cancer  risk  associated  with  environmental  exposures  at  those  early 
residences.  Epidemiologic  investigations  on  the  evaluation  of  environmental  risk  factors 
and  the  estimation  of  breast  cancer  risk  associated  with  lifetime  residential  history  will  be 
performed  with  the  aid  of  GIS  and  spatial  perspectives.  A  manuscript  is  in  preparation 
regarding  the  clustering  of  lifetime  residence  and  breast  cancer  risk  using  exploratory 
spatial  analysis  tools  based  on  these  lifetime  residential  history  data.  Descriptions  of 
findings  follow. 

A. Analysis of Geographic Clustering of Cases and Controls by Period of the Lifetime

First, we completed a GIS-based spatial and temporal analysis for residences of
breast cancer cases and controls in early life and found strong evidence of spatial
clustering for cases during this time. A paper, “Geographic Clustering of Residence in
Early Life and Subsequent Risk of Breast Cancer,” has been accepted for publication in
Cancer Causes and Control. A copy of the complete manuscript is included in
APPENDIX II.

Second, we examined breast cancer risk associated with lifetime residential
history to identify spatio-temporal patterns of risk surfaces in a population-based
case-control study of breast cancer. With growing interest in early-life and lifetime
exposures in relation to breast cancer risk, a life-course approach was adopted to see
whether environments in early life or biological processes around critical events in the
life course may be related to disease in adulthood (Kuh and Ben-Shlomo, 1997; Barker,
1992). We explored the use of density estimation methods in epidemiologic studies as
GIS-based exploratory spatial analyses, and obtained risk surfaces using several measures
such as smoothed ratio and standardized difference. These risk surfaces were produced
and compared between residences for pre-menopausal and post-menopausal women. This
approach provides risk surfaces from lifetime residential history, and thus more accurate
estimates than density surfaces based on only current residential location.

We used six temporal groups: place of birth, the primary residence during the period
of menarche, the primary residence at the time of a woman’s first birth, residence 10 and 20
years prior to diagnosis for the cases and prior to interview for the controls, and current
address. Geocoding of residential locations in the six temporal groups is an essential part of
this study; it enables us to record each individual’s locational information as x and y
coordinates to be used in further spatial analyses. The overall address matching rate was
92.5%. Table 1 is a summary table showing the final numbers of cases and controls for
those six events by menopausal status.

We  first  identified  areas  with  higher  than  average  densities  of  breast  cancer  cases 
in  the  study  area  based  on  the  relative  densities  of  cases  to  controls.  Figure  1  shows  the 
residential  locations  of  breast  cancer  cases  and  controls  in  western  New  York.  There  are 
4,808 residential locations for cases, with 1,334 pre-menopausal and 3,470 post-menopausal
residences, and 8,580 residential locations for controls, with 2,559 pre-menopausal
and 6,010 post-menopausal residences. We produced two maps of risk surfaces
based on the residential locations of cases and controls in Figure 1. Figure 2 shows risk
surfaces for pre- and post-menopausal residences, depicting only areas of high case
density in the study area (ratio greater than 0.5). For instance, areas with a ratio greater
than 0.76 (with contours) indicate a twofold increase in breast cancer risk.
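The idea of a risk surface built from the relative densities of cases and controls can be illustrated with a minimal Python sketch. It is not the software used in this study. The report does not define the ratio, so the sketch assumes one common form, the share of case density in total density, f_case / (f_case + f_control). The function names, the Gaussian kernel, and the bandwidth are likewise assumptions made for illustration.

```python
import numpy as np

def kernel_density(points, grid_x, grid_y, bandwidth):
    """Gaussian kernel density of 2-D points evaluated on a regular grid."""
    pts = np.asarray(points, dtype=float)
    gx, gy = np.meshgrid(grid_x, grid_y)          # each of shape (ny, nx)
    dx = gx[..., None] - pts[:, 0]                # shape (ny, nx, n_points)
    dy = gy[..., None] - pts[:, 1]
    k = np.exp(-(dx ** 2 + dy ** 2) / (2.0 * bandwidth ** 2))
    return k.sum(axis=-1) / (len(pts) * 2.0 * np.pi * bandwidth ** 2)

def density_ratio(cases, controls, grid_x, grid_y, bandwidth):
    """Assumed ratio: f_case / (f_case + f_control) at every grid cell."""
    f_case = kernel_density(cases, grid_x, grid_y, bandwidth)
    f_ctrl = kernel_density(controls, grid_x, grid_y, bandwidth)
    total = f_case + f_ctrl
    has_data = total > 1e-12
    # Cells with essentially no residences are left undefined (NaN).
    safe_total = np.where(has_data, total, 1.0)
    return np.where(has_data, f_case / safe_total, np.nan)
```

Under this assumed definition, a value of 0.5 means equal case and control density at that location, and values closer to 1 mark areas where cases dominate.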

Second, the standardized difference in case and control density was obtained to assess
the variability of the risk surfaces. Figures 3a and 3b depict areas of greater than two
standard deviations (with contours), and statistically significant areas as images (over the
critical value of 3.5). There are several areas of interest for pre-menopausal residences: two
in the center of the study area and one in a rural area, while only one, in a rural area, was
detected for post-menopausal residences. These are statistically significant areas
exceeding the critical value at α = 0.01, indicating that the density of cases is significantly
higher than that of controls. However, the areas that appeared in rural locations for both
pre-menopausal and post-menopausal residences should be interpreted cautiously. As seen
in Figure 1, there are very few residences in those areas. Although the difference between
case and control density there appears statistically significant, it is not reliable
because the small sample size may influence the results.

Finally, we were interested in finding the time periods contributing to this result.
The standardized difference was obtained for each temporal group, separately for pre- and
post-menopausal residences. Figures 4 and 5 show risk surface differences in space
and time. In Figures 4 and 5, areas greater than 2 S.D. are illustrated for each
temporal group, with darker images marking statistically significant areas of high case density.
Tests of significance were performed and are reported as p-values. We found three time
periods of interest for pre-menopausal residences: birth, menarche, and 10 years. These
areas are significantly different at α = 0.1 and indicate areas of high case density. For
post-menopausal residences, we found one significant time period (20 years) at the 0.1
level. Once again, the areas in rural locations for 10 years in Figure 4 and 20 years in Figure 5
appear to be due to the small sample size in those areas. A manuscript describing these results
is in preparation.


Table  1.  Residential  history  of  breast  cancer  cases  and  controls 


              A-Complete Addresses                      B-Missing          C-Total
              Cases                Controls
              Pre-   Post-  All    Pre-   Post-  All    Cases  Controls    (%A/C)

Birth         160    283    505    345    521    804    127    189         1642 (80)
Menarche      204    386    673    469    757    1143   98     154         2068 (88)
First birth   181    371    616    435    782    1153   97     167         2033 (87)
20 years*     210    672    882    413    1201   1614   96     157         2749 (91)
10 years*     258    717    975    501    1266   1767   74     133         2949 (93)
Current       327    826    1157   619    1469   2099   12     18          3286 (99)
Total         1334   3470   4808   2559   6010   8580   504    818         14727 (91)

*  20  and  10  years  prior  to  diagnosis  or  control  selection  respectively 

Figure 1. Geographic distribution of breast cancer cases in western New York: All
residential locations of breast cancer cases and controls included in the analysis.
Panels: pre-menopausal residence, a) cases (n=1334), b) controls (n=2559);
post-menopausal residence, c) cases (n=3470), d) controls (n=6010). Horizontal axis: Easting.

Figure 2. Risk surfaces for pre-menopausal and post-menopausal residence.
Panels: a) relative risk surfaces (pre-menopausal); b) relative risk surfaces (post-menopausal).

Figure 3. Difference in risk surfaces for pre-menopausal and post-menopausal residence.
Panels: a) difference surfaces (pre-menopausal); b) difference surfaces (post-menopausal).
Vertical axis: Northing.

Figure (caption not recovered). Panels: a) Birth (p=0.08); b) Menarche (p=0.10);
c) First birth (p=0.28).


Figure 5. Risk surface difference in space and time: post-menopausal residence.
Panels: a) Birth (p=0.50); b) Menarche (p=0.74); c) First birth (p=0.38);
d) 20 years (p=0.08); e) 10 years (p=0.22); f) Current (p=0.22). Axes: Easting, Northing.

B. Examination of Breast Cancer Risk in Relation to Residential Proximity to
Industrial Sites Contracted by the U.S. Atomic Energy Commission

Ionizing  radiation  is  a  well  recognized  human  mammary  carcinogen.  Numerous 
studies  including  those  of  Japanese  atomic  bomb  survivors  and  tuberculosis  patients 
treated  with  radiation  have  shown  that  breast  epithelium  is  radiosensitive  to  external 
radiation.  In  addition,  exposure  at  early  age  appears  to  be  particularly  important.  Less  is 
known  about  the  effects  of  internal  radiation  on  breast  epithelium.  The  epidemiologic 
evidence  regarding  internal  emitters  and  breast  cancer  comes  primarily  from  radium  dial 
workers  and  from  German  patients  treated  with  high  doses  of  radium-224  for  ankylosing 
spondylitis and tuberculosis. In the German patients, adult women treated with radium-224
had an SIR of 1.77 for breast cancer, while women <21 years of age when treated
with  radium-224  had  an  SIR  of  9.4,  further  suggesting  that  early  life  exposures  may  be 
important  in  mammary  carcinogenesis,  albeit  at  high  doses.  The  effects  of  low-dose 
exposure  to  the  general  population,  however,  have  not  been  demonstrated  and  are 
generally  extrapolated  from  high-dose  exposures. 

The  general  population  is  exposed  to  low-dose  internal  and  external  radiation  from 
both  natural  and  anthropogenic  sources.  Natural  sources  include  decay  of  uranium 
present  in  soil,  dissolved  in  water,  and  absorbed  by  plants  and  animals  that  are  consumed. 
Man-made  sources  include  therapeutic  radioactive  isotopes,  consumer  products  such  as 
tobacco  and  smoke  detectors,  the  nuclear  power  industry,  and  the  military  nuclear 
industry. 

In the 1940s and 1950s, the United States Atomic Energy Commission (USAEC)
and its predecessor the Manhattan Engineering Project (currently, the United States
Department of Energy) contracted with numerous private industries to process uranium
for the burgeoning nuclear program. In Erie and Niagara Counties in Western New York
State,  13  such  industrial  sites  contracted  with  the  USAEC  to  enrich  uranium,  mill 
uranium  metal,  or  to  store  radioactive  waste.  In  addition  to  these  USAEC  activities,  these 
sites  were  also  engaged  in  commercial  industrial  activities  such  as  steel  and  chemical 
manufacturing.  In  this  case-control  study  we  examined  women’s  residential  proximity  to 
these  sites  at  the  time  of  their  birth,  at  menarche,  and  when  they  had  their  first  birth  in 
relation  to  breast  cancer  in  adult  life.  These  industrial  sites  were  examined  because  they 
were  relatively  close  to  residential  neighborhoods  and  were  engaged  in  processing 
radioactive  material  that  resulted  in  residual  environmental  contamination  at  most  sites. 
We  have  focused  on  early  life  exposure  because  it  appears  that  it  is  the  critical  time 
period  in  breast  development  when  breast  epithelium  is  particularly  sensitive  to  effects  of 
ionizing  radiation. 

For  this  study,  we  postulated  that  women  exposed  in  early  life  to  radiation  from 
uranium-238,  uranium-235,  radium-226  and  thorium-232  from  USAEC  site  activities 
would  be  more  likely  to  develop  breast  cancer  than  women  without  these  early  life 
exposures.  Specifically,  we  hypothesized  that  women  born  in  close  proximity  to  USAEC 
sites would be more likely to develop breast cancer than women born further away. In
addition,  we  also  predicted  that  women  who  resided  in  close  proximity  at  the  time  of 
menarche  and  at  first  birth  would  also  have  increased  odds  of  breast  cancer  compared  to 
women  residing  further  away  from  these  sites  at  menarche  and  first  birth. 

Methods


A  population-based,  case-control  study  was  conducted  to  evaluate  the  proposed 
hypotheses.  Cases  consisted  of  1,166  women  aged  35-79  living  in  Erie  or  Niagara 
County  diagnosed  with  histologically  confirmed,  primary,  incident  breast  cancer  between 
the years 1996 and 2001. Controls under 65 years of age were randomly selected from
the New York State Department of Motor Vehicles driver’s license list and controls 65
and over were randomly selected from the Healthcare Financing Administration
Medicare rolls. Controls (n = 2,105) were frequency matched to cases on age, race, and
county of residence. The response rates were 59% for cases and 35% for controls.
Refusal was the most common reason for nonparticipation among both cases and controls. These
estimates of response are somewhat conservative in that they include in the denominator
18% of cases and 45% of controls for whom eligibility could not be determined. With these
individuals removed from the denominator, the response rates were 72% for cases and
79% for controls. The true response rates, however, most likely lie somewhere between
these two estimates for cases and controls.

Extensive  in-person  interviews  and  self-administered  questionnaires  were  used  to 
ascertain  medical  history,  diet,  alcohol  consumption,  smoking  history  (including  passive 
smoke  exposure),  residential  history,  and  occupational  history.  Each  participant  listed  all 
of her residences over her lifetime, starting with the address at the time of interview.

When a subject could not provide a complete address in Erie or Niagara County, Polk
directories and city directories were searched to find the missing information. These
histories  were  used  to  locate  each  subject’s  residence  at  birth,  menarche,  and  first  birth. 

For  the  proximity  at  birth  analyses,  cases  and  controls  were  restricted  to  those  with 
birth  addresses  in  Erie  or  Niagara  Counties.  Of  these,  a  further  241  cases  and  380 
controls were excluded from these analyses because they were born prior to the period of
USAEC  activities  in  this  region  (1942-1956).  A  total  of  261  cases  and  424  controls  were 
included  in  the  birth  analyses. 

For  the  analyses  assessing  exposure  at  menarche,  cases  and  controls  were  restricted  to 
those  with  an  address  in  Erie  or  Niagara  Counties  at  the  time  of  menarche  during  the 
period  of  USAEC  activities,  leaving  581  cases  and  918  controls.  For  the  analyses 
assessing  proximity  at  first  birth,  cases  and  controls  were  restricted  to  those  with  an 
address  in  Erie  or  Niagara  Counties  at  the  time  of  birth  of  their  first  child.  Of  these,  one 
case  and  14  controls  were  excluded  because  their  first  birth  occurred  prior  to  USAEC  site 
activities  for  a  total  of  615  cases  and  1,139  controls. 

Exposure  Assessment 

Proximity  to  USAEC  industrial  sites  was  used  as  a  surrogate  for  exposure  to 
radioactive  pollution  emanating  from  these  sites.  Proximity  to  USAEC  sites  was 
calculated  in  a  two  step  process.  All  birth,  menarche,  and  first  birth  addresses  were 
geocoded  with  ArcView  3.2  (ESRI,  Inc.,  Redlands,  CA)  on  the  Dynamap2000  reference 
theme  (Geographic  Data  Technologies,  Inc.,  Lebanon,  NH).  The  addresses  of  all  13 
USAEC  sites  were  also  geocoded  onto  the  same  reference  theme.  An  extension  to 
ArcView 3.2 was used to calculate a distance matrix for each address to each of the 13
USAEC  sites. 

Subjects  could  theoretically  be  exposed  to  pollutants  from  all  13  sites;  however,  we 
used  the  closest  site  to  calculate  proximity  with  the  rationale  that  the  closest  site  would 
contribute the majority of exposure. An algorithm was created in SAS (SAS Institute,
Inc., Cary, NC) to determine the closest site dependent on the year that the subject’s
critical event occurred (i.e., birth, menarche, or first birth) and the years in which the
USAEC sites were actively engaged in production or processing of radioactive material.
This was done to ensure that only active plants were used to determine proximity. For
example, there were only two plants operating in 1942. Consequently, for participants
born in 1942, their closest site would be one of those two plants. We opted not to use
those subjects born prior to USAEC activities as the truly unexposed because of the
potential to introduce a birth cohort effect. Furthermore, there were only two
premenopausal women who were born prior to 1942, precluding a comparison of this
type for the premenopausal women.
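The closest-active-site rule described above can be sketched in a few lines of Python. This is an illustration only and not the SAS algorithm used in the study. The function name, the site tuple layout, and the site names and years in the usage example are hypothetical, and the sketch assumes projected planar coordinates (for example, kilometers of easting and northing).

```python
import math

def nearest_active_site(residence_xy, event_year, sites):
    """Return (site_name, distance) for the closest site active in event_year.

    sites: iterable of (name, x, y, first_year, last_year), with coordinates
    in the same projected units as residence_xy.
    Returns None when no site was active in that year.
    """
    rx, ry = residence_xy
    best = None
    for name, x, y, first_year, last_year in sites:
        if not (first_year <= event_year <= last_year):
            continue  # skip plants that were not operating at the time of the event
        dist = math.hypot(x - rx, y - ry)
        if best is None or dist < best[1]:
            best = (name, dist)
    return best

# Hypothetical sites and years, for illustration only.
example_sites = [
    ("Site A", 0.0, 0.0, 1942, 1950),
    ("Site B", 3.0, 4.0, 1945, 1956),
]
print(nearest_active_site((3.0, 3.0), 1943, example_sites))  # only Site A is active
print(nearest_active_site((3.0, 3.0), 1948, example_sites))  # Site B is now closer
```

Filtering on the operating years before taking the minimum is what keeps an inactive plant from being counted as the nearest exposure source.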

Radiological  surveys  for  9  of  the  13  sites  were  obtained  from  the  United  States  Army 
Corps  of  Engineers.  Radiological  surveys  were  conducted  primarily  by  the  National 
Laboratory  at  Oak  Ridge  to  assess  the  amount  of  radiological  contamination  present  on 
these  sites  for  the  Formerly  Utilized  Sites  Remedial  Action  Program.  These  surveys  were 
used  to  estimate  the  potential  for  off-site  contamination  and  to  estimate  radiologic  dose 
from  that  contamination.  Whole  body  dose  was  estimated  with  the  RESRAD  7.0  code 
(Argonne  National  Laboratory,  Argonne,  IL)  using  the  default  values  and  the  average  soil 
concentrations  of  uranium-238,  uranium-235,  radium-226,  and  thorium-232  at  the  Linde 
Ceramic  Plant,  which  had  the  highest  ground  concentrations  of  uranium-238,  uranium- 
235,  radium-226,  and  thorium-232  of  all  the  sites.  Average  soil  concentrations  of  these 
radionuclides  were  derived  from  the  radiological  surveys.  Radiological  surveys  for  the 
remaining  four  sites  either  were  never  done  or  could  not  be  located. 

Statistical  Analyses 

The  Student’s  T-test  and  Pearson’s  Chi-square  test  were  used  to  compare  demographic 
and  reproductive  characteristics  between  cases  and  controls.  The  distance  to  the  closest 
site  was  categorized  into  quartiles  based  on  the  distribution  of  distance  in  the  controls. 

The  closest  quartile  was  further  divided  in  two  at  the  midpoint  to  provide  higher 
resolution  for  the  closest  distances.  Logistic  regression  was  used  to  calculate  odds  ratios 
(OR) and 95% confidence intervals (95% CI) for each quartile compared to the furthest
quartile. Multiple logistic regression was used to assess potential confounding by age at
interview, race, education, age at menarche, parity, age at first birth, previous benign
breast disease, family history of breast cancer, body mass index (weight (kg)/height
(m)²), and age at menopause for postmenopausal women. All models were stratified by
menopausal status to assess effect measure modification. P for trend statistics were
determined by the p-value for the coefficient of the continuous exposure variable, while
adjusting  for  covariates. 
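The categorization and odds ratio steps can be sketched in Python as follows. This is a simplified illustration, not the analysis code used in the study. It computes a crude (unadjusted) odds ratio with a Woolf 95% confidence interval from a 2x2 table, whereas the reported estimates come from adjusted logistic regression. The function names are hypothetical, and splitting the closest quartile halfway between the minimum distance and the first quartile is an assumption about what "the midpoint" means.

```python
import math
import numpy as np

def distance_cutpoints(control_distances):
    """Cut points from control quartiles, with the closest quartile split in two."""
    q1, q2, q3 = np.percentile(control_distances, [25, 50, 75])
    lowest = float(np.min(control_distances))
    midpoint = (lowest + q1) / 2.0   # assumed meaning of "midpoint"
    return [midpoint, q1, q2, q3]

def categorize(distances, cutpoints):
    """Category index from 0 (closest) to len(cutpoints) (furthest)."""
    return np.searchsorted(cutpoints, distances, side="right")

def crude_odds_ratio(a, b, c, d):
    """Crude OR and Woolf 95% CI.

    a: cases in the exposure category, b: controls in the exposure category,
    c: cases in the reference category, d: controls in the reference category.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    lower = math.exp(math.log(odds_ratio) - 1.96 * se_log)
    upper = math.exp(math.log(odds_ratio) + 1.96 * se_log)
    return odds_ratio, lower, upper
```

Deriving the cut points from the controls alone, as the text describes, keeps the category boundaries independent of case status. A confidence interval that includes 1 is consistent with no association.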

Results 

Descriptive  characteristics  of  subjects  stratified  by  menopausal  status  are  depicted  in 
Table  1.  Cases  were  born  on  average  1  km  closer  to  a  USAEC  site  than  controls.  For 
postmenopausal women, cases were born about 0.5 km closer than controls.

The  associations  between  proximity  to  USAEC  sites  at  birth  and  subsequent  breast 
cancer  are  shown  in  Table  2.  In  premenopausal  women,  proximity  within  3.3  kilometers 
of  a  USAEC  site  was  suggestive  of  a  slight  increase  in  the  odds  ratio  of  breast  cancer 
(adjusted OR = 1.69, 95% CI = 0.68-4.21), although there were few women in this
category. There was no evidence of a linear association with distance. A similar pattern
was also seen with the postmenopausal women with a slightly raised breast cancer OR for
subjects with birth residences within 3.3 km of their closest USAEC site (adjusted OR =
1.30, 95% CI = 0.43-3.99). Nevertheless, the confidence intervals for both pre- and
postmenopausal women are also consistent with no increase in breast cancer risk.

As previously mentioned, the 13 USAEC sites were engaged in various uranium
processing  activities  and  the  potential  for  the  general  population  to  be  exposed  to 
radionuclides  and  radiation  from  these  sites  may  have  differed  depending  on  the  activities 
at  that  site.  However,  proximity  to  either  waste  storage  facilities  or  uranium 
enrichment/metal  processing  sites  at  the  time  of  birth  was  not  associated  with  breast 
cancer  in  either  pre-  or  postmenopausal  women  (data  not  shown). 

In Table 3, ORs and 95% CIs for proximity of subjects’ addresses at menarche to the
closest USAEC site are shown. There were no consistent associations between proximity to
these sites and risk for this time period of exposure. For premenopausal women residing
within 3.3 km of a USAEC site at menarche, compared with women residing 15 km or
more away at menarche, the OR was 1.42 (95% CI = 0.46-4.34). There was, however, an
apparent reduction in the OR for distances between 3.3 and 10.2 km. For
postmenopausal women, proximity <15 km was associated with a reduction in the OR.
For postmenopausal women residing within 3.3 km of a USAEC site, the OR was 0.54
(95% CI = 0.28-1.02). Proximity of residence at first birth was also not consistently
associated with subsequent breast cancer in either pre- or postmenopausal women (Table
4).

We  used  the  radiological  surveys  and  RESRAD  7.0  code  (Argonne  National 
Laboratory, Argonne, IL) to estimate whole body doses of ionizing radiation from on-site
contamination with uranium-238, uranium-235, radium-226, and thorium-232. In a
worst case scenario, an individual residing on a premises of 1000 m², with the average
concentration  of  radionuclides  estimated  from  the  Linde  Plant,  would  have  an  Effective 
Dose  Equivalent  of  0.42  mSv/year,  which  is  within  the  range  of  background  radiation 
exposure  experienced  by  the  general  population  (3.6  mSv/yr). 

Table 1. Descriptive Characteristics (means (SD) and percentages) of Study Participants Born in Erie and

Table 2. Odds Ratios and 95% Confidence Intervals for Birth Address Distance to the Closest US Atomic Energy Commission Site.

Table 3. Odds Ratios and 95% Confidence Intervals for Menarche Address Distance to the Closest US Atomic Energy Commission

Table 4. Odds Ratios and 95% Confidence Intervals for First Birth Address Distance to the Closest US Atomic Energy Commission

C. Proximity to Chemical or Primary Metal Industrial Sites

Women  living  in  urban  environments  are  at  greater  risk  of  breast  cancer  than 
those  in  rural  settings;  this  difference  is  not  well  understood.  We  conducted  a  study  to 
examine  environmental  exposures  10  and  20  years  prior  to  diagnosis  (cases)  or  interview 
(controls)  in  relation  to  breast  cancer  risk,  in  particular  risk  associated  with:  1)  residential 
proximity  to  chemical  industry  sites;  2)  residential  proximity  to  primary  metal  industry 
sites. This was a population-based case-control study. Cases were women aged 35-79 with
incident,  primary,  histologically  confirmed  breast  cancer  living  in  Erie  or  Niagara 
counties;  controls  were  population  based,  frequency  matched  to  cases  on  age,  race  and 
county.  Self-reported  lifetime  residential  histories  were  collected,  and  missing  address 
information  supplemented  with  Polk  Directory  searches.  863  cases  and  1579  controls 
with  complete  residential  addresses  for  the  periods  10  and  20  years  prior  to  diagnosis  (or 
interview for controls) were included in these analyses. Industrial directories for New
York State for 1978 and 1988 were used to identify chemical and primary metal factories
operating in this region. Chemical facilities in our study included Standard Industrial
Classification (SIC) groups 28 (chemicals and allied products), 29 (petroleum refining
and related industries), and 30 (rubber and miscellaneous plastics products); primary
metal facilities included SIC group 33. We used ArcView 3.2 (with GDT/Dynamap as the base
map) to geocode the addresses. The locations of industrial sites and residences are shown in
Figures 1 and 2. Quartiles were created to categorize the distance from residential address
to  the  closest  industrial  site;  women  living  within  0.25  mile  of  a  facility  were  put  in  a 
separate  category.  We  used  logistic  regression  to  calculate  the  odds  ratios  and  95% 
confidence  intervals,  adjusting  for  age,  education,  race,  BMI,  age  at  first  birth,  age  at 
menarche,  age  at  menopause  (postmenopausal  women),  parity,  first-degree  relative  with 
breast cancer, and previous benign breast disease. For both pre- and postmenopausal
women, there was no evidence that living close to a chemical or primary metal facility 10 or 20
years earlier was associated with increased risk (Tables 1 to 4). In this study, we used
proximity  to  estimate  exposure.  However,  the  complexity  of  the  chemical  mixtures  from 
different  sites  likely  leads  to  exposure  misclassification.  While  our  measure  of  exposure 
to  any  single  compound  is  crude,  the  real  world  exposure  is  generally  to  mixtures  of 
compounds.  In  this  study  we  found  no  effect  of  exposure  to  these  mixtures  in  the  recent 
decades  on  breast  cancer  risk. 
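
The distance categorization described above (quartiles plus a separate category for residences within 0.25 mile of a facility) can be sketched in Python as follows. This is only an illustration: the choice to derive quartile cut points from a supplied list of distances (for example, those of controls), the category labels, and the function name are assumptions, not details given in this report.

```python
import statistics

KM_PER_MILE = 1.609344


def make_distance_categorizer(distances_km, near_cutoff_km=0.25 * KM_PER_MILE):
    """Build a function that maps a distance (km) to an exposure category.

    Quartile cut points are computed from `distances_km` (assumed here to be
    a reference distribution such as the control group). Residences within
    `near_cutoff_km` are placed in their own category.
    """
    q1, q2, q3 = statistics.quantiles(distances_km, n=4)

    def categorize(distance_km):
        if distance_km <= near_cutoff_km:
            return "within 0.25 mile"
        if distance_km <= q1:
            return "Q1 (closest)"
        if distance_km <= q2:
            return "Q2"
        if distance_km <= q3:
            return "Q3"
        return "Q4 (furthest, reference)"

    return categorize
```

In a logistic model, the furthest category would then serve as the reference group, matching the comparison described in the text.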

Figure 1. Locations of Industries and Residence by Case-control Status, Premenopausal Women


Figure 2. Locations of Industries and Residence by Case-control Status, Postmenopausal Women

Table 1. Residential proximity to chemical facility 20 yrs ago and Risk of Breast Cancer

Categories of Distance Cases Controls Crude OR Adjusted OR* P for trend

Table 2. Residential proximity to chemical facility 10 yrs ago and Risk of Breast Cancer

Categories of Distance Cases Controls Crude OR Adjusted OR* P for trend

Table 3. Residential proximity to primary metal industry 20 yrs ago and Risk of Breast Cancer

Categories of Distance Cases Controls Crude OR Adjusted OR* P for trend

Table 4. Residential proximity to primary metal industry 10 yrs ago and Risk of Breast Cancer

Categories of Distance Cases Controls Crude OR Adjusted OR* P for trend

Task 2: To examine estimated exposure to benzene and to PAHs as a risk
factor  for  pre-  and  postmenopausal  breast  cancer,  with  control  for 
appropriate  confounders. 

Task  2  is  complete.  We  have  geocoded  industrial  sites,  especially  major 
PAH  emitters  for  several  decades  and  will  continue  with  this  work  for  the  other 
relevant  periods.  We  are  also  working  on  the  development  of  geographic  models 
to  estimate  historical  PAH  exposure  from  traffic  and  industrial  sites. 

We  have  completed  data  analysis  examining  early  life  exposure  to  total 
suspended  particulates  and  exposure  to  environmental  tobacco  smoke  in  relation 
to  risk  of  breast  cancer  in  adult  life.  We  are  interested  in  both  of  these  exposures 
as  proxies  for  exposure  to  PAHs  and  benzene.  Two  manuscripts  for  these 
analyses  have  been  submitted.  Additional  publications  are  being  prepared. 

A.  Early  Life  Exposure  to  PAHs  and  Breast  Cancer  risk. 

A paper, “Breast Cancer Risk and Exposure in Early Life to Polycyclic Aromatic
Hydrocarbons  Using  Total  Suspended  Particulates  as  a  Proxy  Measure”  has  been 
accepted  for  publication.  A  copy  of  the  complete  manuscript  is  included  in 
APPENDIX  III. 

A paper, “Environmental Tobacco Smoke Exposure in Early Life and the Risk of Breast
Cancer”  has  been  submitted  for  publication.  A  complete  copy  of  the  manuscript  is 
included  in  APPENDIX  IV. 

B. Validation of the Traffic Model


Traffic emissions are a major component of air pollution and a major source of PAHs
in urban areas. While many geographic models have been used to estimate traffic-related
air pollution and to examine its relationship to disease risk, very few of them
were specifically designed to address PAH exposure. Further, the validity of a
model and its ease of application in epidemiologic studies are crucial. Recently, Dr. Jan
Beyea and colleagues developed a traffic PAH model for the Long Island Breast Cancer
project. The model was validated using both spatial and temporal data collected on Long
Island.  The  data  used  for  validation  and  calibration  included  PAH  measurements  carried 
out  on  a  subset  of  study  subjects,  such  as  soil  and  carpet  PAH  concentrations,  and  PAH- 
DNA  adducts  in  study  subjects’  blood.  Measurements  of  carbon  monoxide  (CO)  at  an 
EPA  monitoring  station  were  also  utilized,  because  a  PAH  traffic  model  can  also  predict 
relative  CO  concentrations  by  a  simple  choice  of  model  parameters.  They  concluded  that 
this  model  was  a  valid  tool  to  reconstruct  historical  PAH  exposure. 

Although the model parameters determined from Long Island ideally should be
suitable for use in other areas, caution is still needed before applying the model to a
different geographic location and study setting. While Dr. Beyea’s traffic model was
originally designed to calculate cumulative PAH exposure for the Long Island study, it
has been modified to estimate the average PAH exposure in a slice of time, i.e. each
critical time period, for our study purposes. To examine whether this model is valid in our study
area and to further calibrate the model parameters, we performed this additional
validation study using data from the Erie and Niagara area.

Methods 

Two sources of data were utilized to conduct the validation study, i.e. measured
historical benzo[a]pyrene (BaP) air data and existing carbon monoxide data. The BaP
measurements at 12 locations were taken by the New York State Department of
Environmental Conservation (NYSDEC) from November 1973 to November 1974, using
thin layer chromatography followed by fluorimetric analysis. Hourly CO data were
obtained from the USEPA/NYSDEC for two monitoring stations in Erie County and
two in Niagara County.

All  four  CO  sites  were  visited  and  photographed  to  help  validate  the  latitudes  and 
longitudes  of  the  CO  monitors  listed  on  EPA  and  DEC  websites.  Information  collected 
during  each  site  visit,  e.g.  distance  from  the  monitoring  station  to  the  road,  height  of  the 
probe,  building  and  tree  structures  in  the  vicinity  of  the  stations,  also  helped  us  to  better 
understand the associations between the model-predicted and measured CO levels.
Photographs of the monitor at Site 0016, which had been removed prior to our
investigations, were obtained from NYSDEC. The location of the trailer at Site 2006, which
had also been moved, was obtained from a local resident. Since there is no traffic count
data  for  the  streets  close  to  this  monitor,  its  exact  location  is  not  critical.  The  trailers  for 
the  remaining  two  monitoring  sites  are  still  in  place.  The  test  year  was  taken  to  be  1993, 
which  was  the  earliest  year  for  which  we  had  reasonably  complete  data  for  most  of  the 
monitors.  For  the  station  at  Site  2008,  the  earliest  year  when  data  was  collected  was  1999. 

Traffic count data were obtained from GBNRTC and NYDOT, and interpolation
was used to estimate the data for missing years, as described in the traffic model section.
We also collected 24-hour traffic count data from NYDOT for 5 sites (AVC sites) where
traffic data were continuously collected throughout the day using inductive loop and axle
sensors. Based on this information, the 24-hour traffic curves, i.e. AVC curves, which
show the variation in traffic flow over the day, were calculated. Hourly emissions of PAH
and CO per km of road are taken as proportional to the AVC curves.
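
The proportionality between hourly emissions and the AVC curve can be sketched as follows. This Python fragment is a simplified illustration under stated assumptions: the function names are hypothetical, and the daily emission total per km is treated as a given input rather than computed as in the traffic model.

```python
def hourly_fractions(hourly_counts):
    """Convert 24 hourly traffic counts (an AVC curve) into fractions of daily traffic."""
    if len(hourly_counts) != 24:
        raise ValueError("expected 24 hourly counts")
    total = sum(hourly_counts)
    if total <= 0:
        raise ValueError("total daily count must be positive")
    return [count / total for count in hourly_counts]


def hourly_emissions(daily_emission_per_km, hourly_counts):
    """Distribute a daily emission total per km of road in proportion to the AVC curve."""
    return [daily_emission_per_km * f for f in hourly_fractions(hourly_counts)]
```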

To fit the hourly traffic data for each CO site, we used the nearest AVC curve to
estimate traffic flow. In cases where two AVC stations were at comparable distances, we
averaged the two curves. Since the model-predicted CO is only a relative value, both
measurements and predictions were normalized to the values for the unobstructed Site
0005 by taking the ratio. Since we did not have overlapping years of data for Site 2008,
we took the 1993 measurement to be the same as the 1999 value of 3.8 ppm, based on the fact
that changes in the values at the nearby Site 2006 did not show a statistically significant
trend from 1993-1996 (p=0.34).
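
The normalization to Site 0005 amounts to dividing each site's value by the reference site's value, applied separately to the measured and the predicted series. A minimal Python sketch is given below; the function name is hypothetical and the numbers in the usage note are placeholders, not values from this study.

```python
def normalize_to_reference(values_by_site, reference_site="0005"):
    """Express each site's CO value as a ratio to the reference site's value."""
    reference_value = values_by_site[reference_site]
    if reference_value == 0:
        raise ValueError("reference value must be non-zero")
    return {site: value / reference_value for site, value in values_by_site.items()}
```

For example, `normalize_to_reference({"0005": 2.0, "0016": 3.0})` (placeholder values) gives a ratio of 1.0 for the reference site and 1.5 for the other site, so measured and predicted ratios can be compared even though the model output is only relative.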

Meteorological  data  for  study  areas  was  obtained  from  the  National  Climatic  Data 
Center  (NCDC  1999;  NCDC  2003). 

The  traffic  model  makes  use  of  additional  parameters,  such  as  PAH  deposition 
velocity,  photodecay  rates,  and  washout  rates  that  were  optimized  using  Long  Island  data. 

We  compared  the  measured  BaP  data  with  the  model  predictions,  using 
correlation statistics and graphs. For CO data, two comparisons were made: 1) annual
average CO data against model predictions; 2) hourly predictions
of the CO concentrations against the measured values, averaged over an entire year.
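
The Pearson and Spearman correlations used for these comparisons can be computed as in the following Python sketch. It is a plain standard-library implementation for illustration only; the software actually used for the analysis is not stated in this report, and the function names are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```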

Results 

Most of the BaP sites were in a city and near a major industry, while only one
site, i.e. Site 11, was far from a city (Figure 1). The BaP measurements were taken
from Nov. 1973 to Nov. 1974, with the number of samples collected per site ranging
from 2 to 79 (Table 1).

Figure  2  shows  the  locations  of  the  CO  monitoring  stations  and  the  AVC  hourly 
traffic count measurement stations. Also shown are the roads for which traffic count
information  exists. 

Figure  3  shows  the  range  of  variation  of  the  hourly  traffic  count  from  site  to  site. 
Generally,  the  curves  show  morning  and  early  evening  peaks  corresponding  to  rush  hour 
traffic.  One  curve,  i.e.  Site  5383,  shows  a  midday  peak,  as  well. 

The characteristics of the 4 CO monitoring stations, as well as the data collected,
are summarized in Table 2. Site 0016 is surrounded by large buildings; all the other CO
sites  are  unobstructed. 

Table 3 shows the correlation between model-predicted BaP and measured air
BaP, and between model-predicted and measured hourly CO levels. The correlation
between measured and predicted BaP was 0.54 (P=0.07); the correlations between the
measured and predicted hourly CO ranged from 0.31 to 0.79 for different sites.

Table 4 shows the annual average CO measurements and predictions, with and
without a constant background CO term added to the model. These data are also
plotted in Figures 4 and 5. Figure 4 indicates that the model under-predicts,
particularly at Site 2006, which had missing count data in the immediate vicinity of the
monitor. The nearest roads with traffic counts for this site are about 250 meters from the
monitor. To move this point up to the measurement, the average CO concentration would
have to be doubled, which provides an estimate of the error that might be occurring
at residences far from major roads. For Site 2008, although there were missing data
for one of the nearest streets, we did have traffic data for a side street that is quite close,
resulting in little under-estimation of the measured value.

Discussion 

Despite  large  variations  about  the  regression  line,  the  predicted  BaP  air 
concentrations  correlated  with  the  measured  concentrations  and  the  relationship  between 
the two is approximately linear, as expected (Figure 6). The correlation is modest (0.54),
which is not too surprising given the limited information about the air measurements
(exact sample locations, time of day and date of the measurements) and the varying
number of samples per location. Also, some of the PAH detectors were deliberately
placed  in  the  vicinity  of  industrial  emitters,  which  may  have  increased  the  variance  of  the 
data  compared  to  more  typical  locations  in  the  area.  Nevertheless,  the  large  variations 
from  the  regression  line  raise  the  possibility  of  large  uncertainty  in  predicting  PAH  air 
exposures  that  could  serve  to  mute  the  ability  to  find  a  dose  response,  if  one  exists. 

Although  the  traffic  model  was  designed  to  predict  traffic-related  PAH,  it  can  also 
predict  the  relative  concentrations  of  CO  from  traffic.  The  CO  model  is  equivalent  to 
running  the  PAH  model  with  certain  parameters  set  to  zero,  namely  “deposition 
velocity,”  “light  decay,”  and  “rain  washout.”  By  comparing  the  relative  CO  exposure, 
predicted  vs.  measured,  we  thus  check  the  dispersion  part  of  the  model,  but  not  the 
removal  processes. 
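
The relationship between the PAH and CO versions of the model can be illustrated with a deliberately simplified Python sketch. The functional form below (a dispersion term multiplied by a first-order removal factor) is an assumption made for illustration and is not the actual formulation of the traffic model; all names and default values are hypothetical. It shows only that setting the three removal parameters to zero leaves the dispersion term unchanged.

```python
import math


def removal_factor(travel_time_s, deposition_velocity_m_s=0.0,
                   mixing_height_m=1000.0, photodecay_rate_s=0.0,
                   washout_rate_s=0.0):
    """Fraction of pollutant remaining after transport (assumed first-order loss)."""
    loss_rate = (deposition_velocity_m_s / mixing_height_m
                 + photodecay_rate_s + washout_rate_s)
    return math.exp(-loss_rate * travel_time_s)


def relative_concentration(dispersion_term, travel_time_s, **removal_params):
    """Relative concentration = dispersion term x removal factor.

    With all removal parameters at zero (the CO case), the result equals the
    dispersion term, so a CO comparison tests dispersion only.
    """
    return dispersion_term * removal_factor(travel_time_s, **removal_params)
```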

In Figures 7 to 10, we compared hourly predictions of the CO
concentration to the measured values, averaged over an entire year, for different sites. We
show the results with the constant background term added, because it improves the visual
appearance of the fit, although it does not change any of the correlation coefficients.
Except at Site 0016, the meteorological dispersion patterns have changed the
temporal emission pattern shown in Figure 3, muting the afternoon traffic peak and
extending the curves into the late evening and early morning hours.

For Site 2006, we have no traffic data for the road immediately next to the
monitor station. As a result, the distant contributions, with a muted morning peak, are more
important. Were traffic data for the closest streets added, the morning peak would
increase. This gives us an idea of what will happen with residences that are relatively far
from roads with traffic counts (Figure 9).

The presence of the strong afternoon peak in the measured data for Site 0016
and the absence of the extended evening peak are something of a puzzle (Figure 8).
Possibly,  traffic  at  this  government  center  dropped  off  more  rapidly  after  6  pm  than  at  the 
sites  at  which  hourly  traffic  counts  are  available.  The  complex  dispersion  patterns  set  up 
by  the  tall  buildings  at  this  site  might  have  led  to  enhancement  of  the  afternoon  emission 
peak.  In  any  case,  these  fine  points  in  the  shape  of  the  temporal  pattern  have  little  to  do 
with  the  annual  average  values  that  are  of  interest  for  the  epidemiological  aspects  of  the 
study.  In  particular,  correlation  coefficients  for  hourly  patterns  do  not  reflect  the  average 
values,  which  are  of  interest  for  the  epidemiologic  study.  Still,  it  is  interesting  to  see  that, 
in  general,  the  meteorological  dispersion  model,  which  is  adding  up  contributions  from 
both local and distant streets and dispersing them differently at different times of the day,
produces  output  that  approaches  the  measured  time  trend,  albeit  with  varying  success. 

Adding a constant background term to the traffic model, as shown in
Figure 5, brought the predictions into better agreement with the measurements, although
there was still some under-prediction at Site 2006 (the model still under-estimated
CO by about 25%). It is possible that contributions from roads throughout the area where
traffic  counts  are  not  available,  along  with  industrial  emissions,  may  be  contributing  an 
overall  background  level  of  CO.  Residential  heating  is  unlikely  to  be  making  a  significant 
contribution,  because  examination  of  the  CO  data  by  season  shows  no  substantial  winter 
increase  that  might  have  occurred  from  space  heating  (Figure  11). 

In  summary,  nothing  in  our  validation  study  using  local  data  suggested  that  there 
need  be  any  significant  modifications  to  the  traffic  model,  as  calibrated  with  the  more 
extensive  Long  Island  data.  Based  on  this  study,  it  is  appropriate  to  use  the  model,  with 
site  specific  meteorology  and  traffic  count  data,  to  reconstruct  the  historical  PAH  levels 
in Erie and Niagara counties, although caution is needed for people who lived far
away from major roads with traffic counts.

Figure 1. Locations of the 12 BaP Sites

Table 1. BaP Data for 12 locations in Erie and Niagara County, Nov. 1973 - Nov. 1974

10 Ton. CAM (tonawanda Two Miles Creek Rd. (779 Tonawanda 47 0.00222 0.012 0.00001

Figure 2. Locations of the 5 AVC Sites and 4 CO Monitoring Stations

Table 2. Descriptions of the CO Monitoring Stations

Number indicates counter designation. Probe heights all 4.5 m.

Table 3. Correlation Between Model Predicted and Measured BaP/CO Level

Pearson Correlation Spearman Correlation

Table 4. Measured and Predicted CO at 4 different locations in Erie and Niagara County, 1993

Site Id Annual Average Normalized to Site Predicted CO Ratio Predicted CO Ratio

Figure 4. Predicted vs. Measured CO Ratio in Different Sites, with no Background CO Term Added

Figure 5. Predicted vs. Measured CO Ratio in Different Sites, with Constant Background CO Term Added

Figure 6. Predicted vs. Measured BaP in 12 locations in Erie and Niagara County (Log transformed)

Figure 7. Predicted (with Constant Background Added) and Measured Hourly CO, Site 0005

Figure 8. Predicted (with Constant Background Added) and Measured Hourly CO, Site 0016

Figure 9. Predicted (with Constant Background Added) and Measured Hourly CO, Site 2006

Figure 10. Predicted (with Constant Background Added) and Measured Hourly CO, Site 2008

Figure 11. Measured CO at Site 0005 by Season, 1993


C.  Traffic  PAH  Exposure  and  Risk  of  Breast  Cancer 

Traffic  is  one  of  the  major  sources  of  PAH  exposure  in  cities,  especially  after  the 
automobile  became  a  common  means  of  transportation.  While  some  studies  suggest  that 
food  is  the  predominant  source  of  PAH  for  human  exposure  (ATSDR),  the  estimates  of 
PAHs from diet are less reliable because of the uncertainty of food origins. Studies also
show that dietary sources of PAHs are only weakly correlated with internal PAH dose,
for example PAH-DNA adducts. According to reports from ATSDR, the major sources of
PAH  air  exposure  in  the  U.S.  are  wood  burning  and  traffic  emissions.  Mobile  emissions 
account  for  20%  of  the  total  PAH  emissions,  and  this  is  similar  to  reports  from  6 
European  countries.  The  percentage  of  traffic  emission  is  even  higher  in  urban  areas.  One 
study  found  that  motor  vehicles  may  account  for  up  to  90%  of  the  total  particle-bound 
PAH  in  the  downtown  area  of  a  city.  This  has  been  confirmed  by  another  study  in  the 
UK.  Traffic  emissions  are  one  of  the  best  characterized  important  sources  of  PAHs.  The 
amount  of  PAHs  in  traffic  emissions  is  determined  by  the  type  and  parameters  of  fuel 
used,  driving  conditions,  temperature,  exhaust  treatment,  and  engine  adjustment. 

In  this  study,  we  used  a  GIS  traffic  model,  developed  by  Dr.  Jan  Beyea  and 
colleagues, to estimate the historical residential exposure to PAHs from traffic. As
mentioned in the previous section, this traffic model has been validated as a
useful tool to reconstruct historical PAH exposure at residential locations, using data
collected from the Long Island Breast Cancer Study, as well as some additional data from
our study area. The association between traffic PAH exposure and breast cancer
risk was examined for time periods of potential relevance to breast cancer development, i.e. at
menarche, at first birth, and 20 and 10 years prior to cancer diagnosis.

Methods 
Data  Collection 

Data  from  several  sources  were  collected  to  determine  the  PAHs  from  vehicle 
emissions. 

Historical  county  traffic  volumes  were  obtained  from  the  Greater  Buffalo-Niagara 
Regional  Transportation  Council  (GBNRTC)  for  the  years  from  1971  to  2002,  and  the 
New  York  State  Department  of  Transportation  (NYDOT)  for  the  years  between  1960  and 
1975. In both sources, the traffic volume was recorded for each road segment (with a start and
end point) over which traffic flow is approximately uniform. The length of each
segment varied from 0.1 to 10 miles. While the NYDOT data cover only the
touring route system, the GBNRTC data also include local highways. Five
functional classifications of roads are available in the GBNRTC traffic system, including
interstate, expressway, principal arterial, minor arterial, and collector (Figure 1), covering
the major roads in the traffic network (Figure 2). Annual Average Daily Traffic (AADT)
in both data sources represents the total traffic volume in both directions, taking into
account vehicle type and seasonality.

Tailpipe emission data were collected from published journal articles and reports,
including measurements carried out in tunnels or on individual vehicles run in place on
test beds. Based on these raw data, two curves, i.e. a tunnel fit and a scaled dynamometer fit,
were developed to model historical tailpipe emissions (Figure 3).

Meteorological data were obtained from the Environmental Protection Agency (EPA)
and the National Climatic Data Center. These data include wind speed, wind direction,
“stability class” (or, equivalently, temperatures at two heights on the meteorological tower),
and twice-daily “mixing height”.

Exposure  Assessment 

In our traffic model, total traffic PAH emissions comprised three terms, i.e.
cruise (warm engine) emissions, cold engine emissions and intersection emissions. Two
separate weights, derived from the Long Island Breast Cancer Study and the validation
results using local data, were applied in the model to adjust for the higher emissions from
cold engines and at intersections. To obtain indoor PAH exposure, we applied a building
penetration factor (0.75) to the total traffic PAH emissions.
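
The combination of the three terms and the building penetration factor can be sketched in Python as below. Only the factor 0.75 comes from the text; the additive weighted form, the parameter names, and the default weights of 1.0 are assumptions made for illustration, since the fitted weights are not reported here.

```python
BUILDING_PENETRATION_FACTOR = 0.75  # value stated in the text


def indoor_pah_exposure(cruise, cold_engine, intersection,
                        cold_weight=1.0, intersection_weight=1.0):
    """Indoor exposure as the penetration factor times the weighted total.

    `cold_weight` and `intersection_weight` stand in for the two weights
    described in the text; their default values here are placeholders.
    """
    total = cruise + cold_weight * cold_engine + intersection_weight * intersection
    return BUILDING_PENETRATION_FACTOR * total
```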

1)  Cruise  emissions:  Computed  as  the  product  of  tailpipe  emission  and  traffic 
counts  in  the  road  network,  i.e.  Emissions=tailpipe  emissions  per  vehicle-km  *  traffic 
count  *  road  segment  length. 
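
The cruise emission formula above translates directly into code. The following Python sketch is illustrative; the function names are hypothetical, and the mile-to-kilometer conversion is included only because segment lengths are reported in miles while the emission factor is per vehicle-km.

```python
KM_PER_MILE = 1.609344


def cruise_emission(tailpipe_per_vehicle_km, traffic_count, segment_length_km):
    """Emissions = tailpipe emissions per vehicle-km * traffic count * segment length."""
    return tailpipe_per_vehicle_km * traffic_count * segment_length_km


def total_cruise_emissions(segments):
    """Sum cruise emissions over (emission factor, AADT, length in miles) tuples."""
    return sum(
        cruise_emission(factor, aadt, miles * KM_PER_MILE)
        for factor, aadt, miles in segments
    )
```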

Historical traffic volume information for Erie and Niagara Counties was extracted
from the reports of GBNRTC (1971-2002) and NYDOT (1960-1975, State roads only),
with  AADT  as  the  basic  unit  of  measurement.  We  then  assigned  these  traffic  volumes  to 
each  of  the  54,494  major  road  segments  in  the  two  study  counties,  using  the  nearest 
available  measurement  on  that  road  within  10  km,  and  we  repeated  this  for  all  the  years 
in  the  study  windows. 

Traffic volume was not monitored every year, which resulted in gaps in the traffic
data. When multiple years of traffic data were collected for a point, interpolation or
extrapolation (within 10 years) was used to estimate the traffic volume in missing years;
when only one year of traffic data was collected, countywide traffic growth rates were
used to fill these gaps.
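
The gap-filling rule for points with multiple years of counts can be sketched in Python as follows. This is a sketch under assumptions: it uses linear interpolation and extrapolation, whereas the exact functional form used in the study is not specified in this paragraph, and the function name and the choice to return None beyond the 10-year limit are hypothetical.

```python
def estimate_aadt(counts_by_year, year, max_extrapolation_years=10):
    """Estimate AADT for `year` from a dict of {year: AADT} with >= 2 entries.

    Interpolates linearly between the bracketing count years, extrapolates
    linearly from the two nearest count years up to the stated limit, and
    returns None when the year is too far outside the observed range.
    """
    years = sorted(counts_by_year)
    if len(years) < 2:
        raise ValueError("need counts for at least two years")
    if year in counts_by_year:
        return float(counts_by_year[year])
    if year < years[0]:
        if years[0] - year > max_extrapolation_years:
            return None
        y0, y1 = years[0], years[1]
    elif year > years[-1]:
        if year - years[-1] > max_extrapolation_years:
            return None
        y0, y1 = years[-2], years[-1]
    else:
        y0 = max(y for y in years if y < year)
        y1 = min(y for y in years if y > year)
    slope = (counts_by_year[y1] - counts_by_year[y0]) / (y1 - y0)
    # Traffic volume cannot be negative, so clamp extrapolated values at zero.
    return max(0.0, counts_by_year[y0] + slope * (year - y0))
```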

In  this  study,  we  ignored  the  effect  of  PAH  exposure  from  areas  surrounding  the 
two  study  counties.  These  should  add  little  because  the  study  region  is  bounded  by  Lake 
Erie  and  suburban  area  with  very  low  traffic  flow. 

Since  traffic  data  started  in  1960,  we  did  not  use  logarithmic  extrapolation  for 
cruise emissions before that time, due to a concern regarding misclassification. The
same rule was applied to the calculation of cold engine and intersection emissions.

2)  Cold  engine  emissions:  Cold  engine  emissions  usually  contain  higher  levels  of 
PAHs than warm engine emissions, so we calculated them separately. As for warm
engine emissions, we used historical AADT and tailpipe emissions to construct the cold engine
emissions.

AADT was first calculated at the census block level, estimated as the product of the
total number of cold starts per household per day and the number of households in each
census block, and was then assigned to the roads within the census block. We obtained
the number of cold starts in 1995 from the Nationwide Personal Transportation Survey
(NPTS 1995), and estimated the number of cold starts between 1960 and 1995 by scaling
from the national figures from Hu. The number of households was collected from the
historical US census data. We assumed that cold engine emissions would last for 1 km of
travel distance. Once the AADT in each census block was calculated, we assigned
it among the roads, using the inverse square of the distance to each road from the
centroid of the census block. We set 1 km as the total trip length from the centroid to
the last point included on a major road, and we attempted to assign the emissions
uniformly to all local residential street segments lying within a census block.
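
The inverse-square allocation can be sketched as below. The function name, road labels, distances and cold-start rate are hypothetical; the sketch covers only the weighting step and assumes every road lies at a distance greater than zero from the centroid.

```python
def allocate_cold_start_traffic(cold_starts_per_household, households,
                                road_distances_km):
    """Split a census block's cold-start traffic among nearby roads.

    Block traffic = cold starts per household per day * households.
    Each road receives a share proportional to 1 / distance**2, where
    distance is measured from the block centroid to the road.
    """
    block_traffic = cold_starts_per_household * households
    weights = {road: 1.0 / d ** 2 for road, d in road_distances_km.items()}
    total_weight = sum(weights.values())
    return {road: block_traffic * w / total_weight
            for road, w in weights.items()}


# Hypothetical block: 100 households, 5 cold starts per household per day,
# two roads at 1 km and 2 km from the centroid.
shares = allocate_cold_start_traffic(5, 100, {"road_A": 1.0, "road_B": 2.0})
```

With these hypothetical inputs, the road at half the distance receives four times the traffic of the farther road, and the shares sum to the block total.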

3) Intersection emissions: Since accelerating and decelerating at intersections
may increase emissions, we assigned a weight to the segments within 200 meters of an
intersection. Since there was no detailed information about traffic control at
intersections, we assumed that 10% of the traffic was exiting or entering, thus emitting
more PAHs.

Statistical  Analysis 

All analyses were done separately for pre- and post-menopausal women unless
otherwise specified. In our study, women were defined as post-menopausal using specific
criteria obtained during the personal interview. We took into account hormone use
(present or past), reason for cessation of menses (natural, surgical, chemotherapy,
radiation, etc.) and age. All women 60 years of age or older were considered
post-menopausal.

To describe the distribution of the studied variables, means and standard
deviations (SDs) were presented for the continuous variables in the case and control
groups, and t-tests were used to compare means. Chi-square tests were used for
categorical variables. Pearson correlations among the continuous variables were
presented to examine the interdependency of these variables.

Unconditional logistic regression was used to calculate the odds ratios (OR) and
95% confidence intervals (CI). Because a linear dose-response relationship is assumed
between PAH exposure and breast cancer risk, we entered the PAH estimates from the
geographic models into the regression model as a continuous variable to test for linear
trend. Common breast cancer risk factors were adjusted for in the model as potential
confounding variables, including age, education, race, BMI, age at first birth, age at
menarche, age at menopause (for post-menopausal women only), number of births,
first-degree relative with breast cancer, and previous benign breast disease. To test for
potential effect modification, the products of the PAH estimates with smoking and with
diet were introduced into the regression model, and stratified analyses were performed.
Stratified analyses by ER/PR status were also conducted.

Because the PAH exposure estimates calculated in this study showed a skewed
distribution, the natural log of each PAH exposure estimate was taken to bring the
distribution closer to normal. All the PAH exposure estimates mentioned below are
therefore natural-log transformed, unless otherwise specified.
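
The transformation, and the quartile grouping used in the later tables, can be illustrated as follows. This is a sketch under assumptions: the report does not say which group's distribution defined the quartile cut points, so here they are taken from whatever list is supplied, and the function name and input values are hypothetical.

```python
import math
import statistics


def log_quartiles(estimates):
    """Natural-log transform exposure estimates and assign quartile groups.

    Returns (logged values, three cut points, quartile group 1-4 per value).
    Cut points come from the supplied values themselves (an assumption).
    """
    logged = [math.log(x) for x in estimates]
    cuts = statistics.quantiles(logged, n=4)  # three quartile cut points

    def category(value):
        # Group 1 is the lowest quartile, group 4 the highest.
        return 1 + sum(value > c for c in cuts)

    return logged, cuts, [category(v) for v in logged]


# Hypothetical, strongly right-skewed raw estimates.
raw = [math.exp(i) for i in range(1, 9)]
logged, cuts, groups = log_quartiles(raw)
```

After the transform the hypothetical values are evenly spaced, so each quartile group receives two of the eight observations.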

Results 

The characteristics of breast cancer cases and controls by menopausal status are
listed in Table 1-1 and Table 1-2. Although controls in our study were matched to cases
by age and race, small differences still existed, which might be due to the use of only
frequency matching and/or to only a small percentage of the study participants being
included in each time period analysis. For example, among post-menopausal women in the
menarche analyses, cases were on average 2 years older than controls (49.4 vs. 47.3
years old), noting that only 128 women (i.e. 52 cases and 76 controls) were included in
these comparisons; among post-menopausal women in the 20 years prior analyses, there
were more whites among cases than among controls, 93.3% vs. 90.4%. The distributions of
education and BMI were similar between cases and controls.

Correlations between traffic model predicted PAHs and the control variables are
listed in Table 2-1 to Table 2-4. Using different parameter settings, which include Long
Island/Buffalo intersection weights and with/without problematic traffic counts included,
we obtained four sets of PAH estimates, i.e. BaP (R5) to BaP (R8). For all time periods
and for both pre- and post-menopausal women, the correlations among them are very
high. The nonparametric correlations among these different estimates are even stronger,
indicating that most of the subjects will fall into the same quartile category in all
following regression analyses no matter which estimates are selected. Thus we decided
to use BaP (R6), where the Buffalo intersection weights were used and problematic
traffic count information was excluded, as the only PAH estimate in all subsequent
analyses. Distance to the closest road with a traffic count and year at exposure were
negatively correlated with PAH estimates, while traffic count in 1990 was positively
correlated with PAH estimates. This showed that the PAH estimates reflected the combined
effects of distance to closest road, year at exposure and traffic count. It is worth
mentioning that year at interview was negatively correlated with PAH estimates. Because
the cases were generally interviewed later than controls, year at interview was
controlled for in all subsequent regression analyses. None of the other correlations
was strong.

The correlations within the control variables were not strong. Age was generally
positively correlated with age at menarche, age at menopause and parity; BMI was
negatively correlated with years of education and age at menarche.

The crude and adjusted ORs are shown in Table 3. In the crude comparison,
cases were more likely than controls to have a higher PAH exposure at
menarche and at the woman's first birth. At menarche, among pre-menopausal women, the
OR was 1.99 for the highest quartile. At first birth, among post-menopausal women, the
OR was 1.61 for the highest quartile. However, this increased risk was not seen among
post-menopausal women in the menarche analyses or among pre-menopausal women in the
first birth analyses. Because controls were generally interviewed earlier than cases,
and air PAH levels dropped through the years, year of interview was
entered into the adjusted models. After controlling for year at interview as well as other
factors, i.e. age, education, race, BMI, age at first birth, age at menarche, age at
menopause, number of births, first-degree relative with breast cancer, and previous
benign breast disease, the suggested increased risk in the menarche and first birth
analyses remained. Traffic PAH estimates were also entered into the regression models as
a continuous variable to examine whether there was a potential dose-response effect
between PAH exposure and breast cancer risk. Only among pre-menopausal women in the
menarche analyses was the P for trend significant (p=0.03).
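
For readers who want to see how a crude odds ratio for one exposure quartile against the reference quartile, and its 95% confidence interval, can be obtained, the following sketch uses the standard 2x2-table calculation with a log-scale (Woolf) interval. This is a swapped-in illustration: the report estimated its ORs with unconditional logistic regression, and the counts below are hypothetical, not study data.

```python
import math


def odds_ratio_ci(cases_exposed, controls_exposed,
                  cases_ref, controls_ref, z=1.96):
    """Crude odds ratio and Woolf confidence interval from a 2x2 table.

    OR = (cases_exposed * controls_ref) / (controls_exposed * cases_ref)
    The interval is exp(ln(OR) +/- z * SE), where
    SE = sqrt(1/a + 1/b + 1/c + 1/d) over the four cell counts.
    """
    odds_ratio = (cases_exposed * controls_ref) / (controls_exposed * cases_ref)
    se = math.sqrt(1 / cases_exposed + 1 / controls_exposed
                   + 1 / cases_ref + 1 / controls_ref)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts: highest quartile (20 cases, 10 controls)
# versus reference quartile (10 cases, 20 controls).
or_hat, ci_low, ci_high = odds_ratio_ci(20, 10, 10, 20)
```

An interval that excludes 1.00 corresponds to the entries marked as statistically significant in the tables; adjusted ORs require a regression model and cannot be obtained from the table counts alone.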

To address the problem that the traffic model may underestimate PAH
exposure for women who lived far from a road with traffic counts, as suggested by the
traffic model validation results in the previous chapter, we restricted the analyses to
women living within 250 meters of a road with traffic counts. The primary findings
remained unchanged (Table 4).

To better explore the potential confounding effect of socioeconomic status in our
study, we tried adjusting for income instead of education (or for both income and
education) in the analyses. Using these different sets of adjustment variables
did not significantly change our results (Table 5 and 6).

When we stratified the subjects by smoking status, the increased risk was seen only
in non-smokers (Table 9 and 10). Among pre-menopausal women in the menarche
analyses, cases were about 7 times more likely to be in the highest quartile group than
controls. We found an increased risk of breast cancer for both pre- and post-menopausal
women in the first birth analyses (Table 9).

To examine whether there was any confounding effect of smoking, we entered
pack-years of smoking into the regression model. Controlling for smoking did not change
the previous findings (Table 11).

Effect modification by ER/PR status was examined. In these analyses, the whole
set of controls was included in all regressions. As indicated in Table 12 and 13, the
results were similar for the ER(-) and ER(+) strata, and there was no increased breast
cancer risk in the higher PAH exposure categories. Stratified analyses by PR status
showed similar results, except in the first birth analyses, where the PR(+) analyses
suggested a slightly stronger increase in breast cancer risk among women exposed to
higher PAHs, compared with PR(-) (Table 14 and 15).

Discussion 

Studies have found exposure to traffic emissions to be associated with increased
risk of total childhood cancer and childhood leukemia, but to our knowledge, this is one
of the first studies to examine traffic emissions and breast cancer risk.

Using a GIS traffic model to estimate residential exposure to PAHs from
traffic in potentially critical time periods of breast cancer development, i.e. at
menarche and at first birth, our study also indicated that women exposed to higher levels
of traffic PAHs had a higher risk of breast cancer, and that this increased risk existed
only in non-smokers.

The fact that increased breast cancer risk was observed only in earlier life time
periods, i.e. at menarche and first birth, may suggest that the observed association is
due to the generally high level of PAH exposure in earlier years. Additional analyses by
year of exposure were conducted to clarify this. Five years were selected, namely 1960,
1965, 1970, 1980, and 1990. Only addresses at which the participants resided during each
of these 5 years were included in each subset analysis. The increased risk was found only
in the 1960, 1965 and 1970 analyses among pre-menopausal women, but not among
post-menopausal women (Table 16). These findings may also provide additional evidence for
our hypothesis of the importance of earlier life exposure.

The striking difference between the analyses stratified by smoking status is quite
interesting. Although smoking is another important source of PAHs, most previous
studies did not find an association between smoking and breast cancer risk. Our study
suggests that among smokers, whose PAH exposure may already be high, additional
exposure from traffic may not make a difference in terms of breast cancer risk. However,
among non-smokers, whose PAH exposure from smoking is potentially low, increased
traffic exposure may greatly increase breast cancer risk.

One study has suggested that in situ cancer may be more sensitive to environmental
exposures with geographic variation. Although there seem to be some indications of this,
our results are limited by the small sample size (Table 7-1 and 8). Since we have only
very few women with in situ breast cancer in our study, we do not have enough
power to draw any conclusions regarding this, even after we tried adjusting for fewer
covariates in reduced models (Table 7-2).

Although our model is a novel way of using a geographic traffic model to
reconstruct historical PAH exposure, this method has many limitations.

1) Row housing: Since the percentage of row housing is small in the study area, we did
not incorporate “canyon effects” into the modeling.

2) Historical changes in road network: Although roads changed over time, we did not
include an algorithm to “remove” road sections from the network in backwards
extrapolation. This means that in a few cases, we might overestimate exposures in the
early years due to the inclusion of emissions from roads not yet built.

3) In-vehicle exposures: We did not account for exposure to traffic PAHs while driving.

4) Neglect of other PAH air exposures: Our study is premised on the experimental finding
that traffic exposures are a dominant source of airborne PAHs, both indoors and outdoors.
Thus, although we neglected emissions from indoor sources (e.g., PAHs in airborne
combustion products of cigarette smoke and cooking), we expect that the amount of PAHs
from traffic alone is sufficient to allow an effect on breast cancer risk to be detected,
if there is an association between breast cancer and the PAHs delivered on ultrafine
traffic particulates.

5) Limited data on earlier exposure: Because traffic count information was available only
for a limited range of years, we were not able to include residential addresses prior to
1960 in this study. For this reason, we were not able to examine the association between
PAHs at birth and breast cancer (noting that the majority of the participants were born
before 1960), although birth is another very important critical time window in terms of
the development of breast cancer. For the same reason, the sample size for both the
menarche and first birth analyses was greatly reduced.

As a case-control study, this study may be prone to several types of bias.
Selection bias may be a concern, especially as we had a relatively low participation
rate. However, although cases living near our study center were more likely to
participate, we found the same pattern in controls. The other common breast cancer risk
factors did not differ between subjects included and excluded, except that those excluded
were a little older and more likely to have 3 or more children. Recall bias is not likely
in this study: at the time of the study, participants were generally unaware of our study
hypotheses. Misclassification does exist, both from self-reported residence and from
using a traffic model to estimate PAH exposure; however, it is more likely to be
non-differential and would tend to draw the study results toward the null. As we
indicated in the previous chapter on model validation, there was a potential problem of
underestimating exposure for people living far away from a major road with traffic
counts. However, restricting the analyses to only those who lived close did not change
our primary findings.

In  this  study,  most  of  the  study  participants  provided  their  lifetime  residential 
history;  and  significant  efforts  have  been  made  to  reconstruct  historical  PAH  exposure. 
This  allowed  us  to  examine  the  relationship  between  breast  cancer  and  PAHs  in  different 
time  windows,  especially  those  critical  time  periods  of  breast  tissue  development. 

Our study population was residentially stable. While the mean age of
our subjects was about 56 years, participants moved on average 5 times in their
lifetime. The stability of this population gave us a unique opportunity to examine the
long-term effect of PAHs, and made examining only critical time windows more meaningful.
Subjects were likely to have been staying in the same residence for many years before
the time period we examined.

Further, common risk factors for breast cancer and other characteristics of the
study participants were collected by interviews and self-administered
questionnaires. This made it possible to examine the potential effect of confounders and
effect modifiers in the models. We also have some data on other sources of PAHs, such
as smoking, which allowed us to examine and adjust for their effects in the traffic
models.

In summary, using a geographic model, we found some evidence that high traffic
PAH exposure is associated with increased risk of breast cancer, particularly for
exposure earlier in life. This study provides additional evidence for the hypothesized
association between PAHs and breast cancer. However, the study results must be
interpreted with caution because of the study limitations mentioned earlier. Future
studies are needed, for both critical time windows and lifetime exposure, to confirm the
findings of this study.

Figure 1. Functional Classification of the Traffic System in GBNRTC


Figure 2. Density of the Roads with Traffic Count Information

Figure 3. Tailpipe Emission Data Collected and Model Fits

Table 1-1. Characteristics of Study Sample by Case-Control Status, in menarche analyses

PAH exposure, run8                                     9.3 (0.9)*         9.1 (1.1)*         9.0 (1.0)          9.1 (1.0)
Distance to closest road with traffic count (meters)   172.9 (213.3)*     232.2 (458.2)*     164.4 (294.8)      197.5 (446.5)
                                                       (median=116.5)     (median=135.5)     (median=88.9)      (median=102.8)
Number of years stay in this address                   16.7 (6.9)         17.3 (6.2)         14.6 (6.4)         15.7 (8.3)
AADT in 1990 of the closest road                       12515.4 (15962.1)  11922.1 (15398.8)  11819.9 (16533.3)  12250.8 (16088.1)

Table 1-2. Characteristics of Study Sample by Case-Control Status, in first birth analyses

Distance to closest road with traffic count (meters)   199.8 (233.6)      244.3 (431.4)      167.7 (193.0)      183.9 (310.8)
                                                       (median=135.5)     (median=132.0)     (median=119.7)     (median=123.2)
Number of years stay in this address                   9.6 (7.6)          10.6 (8.0)         11.6 (12.3)        11.0 (10.7)
AADT in 1990 of the closest road                       11671.8 (13271.5)  12140.7 (13150.2)  12254.7 (9694.3)   12518.6 (15251.2)

Table 2-2. Correlations among the variables, in Menarche analyses, post-menopausal women

Yrs at App | Dist_close Rd (n=12 | Yrs at Addr. | AADT90 | Year at Exp. | Age (year) | Edu. | BMI | Age at Menar. | Age at Menop. | Parity | BaP (R5) | BaP (R6) | BaP (R7) | BaP (R8) | Income | Smoking (Packyear)

Table 2-4. Correlations among the variables, in First-birth analyses, post-menopausal women

Yrs at App | Dist_close Rd (n=52 | Yrs at Addr. | AADT90 | Year at Exp. | Age (year) | Edu. | BMI | Age at Menar. | Age at Menop. | Parity | BaP (R5) | BaP (R6) | BaP (R7) | BaP (R8) | Income | Smoking (Packyear)

Smoking (Packyear)   -0.04   0.04   -0.03   0.02   -0.15   0.03   -0.13*   0.03   -0.08   -0.09*   0.00


Table 3. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer

PAH quartile (range)    Cases   Controls   Crude OR (95% CI)    Adjusted OR (95% CI)   P for trend
2: 7.57 thru 8.35       67      77         1.72 (1.04-2.85)*    2.09 (0.93-4.67)
3: 8.35 thru 8.76       53      78         1.34 (0.80-2.26)     1.19 (0.51-2.77)
4: 8.76 thru Highest    62      76         1.61 (0.97-2.68)     2.58 (1.14-5.82)*      0.19

* Odds ratios and 95% confidence intervals adjusted for age, education, race, BMI (1 yr before interview), age at
first birth, age at menarche, age at menopause (for post-menopausal women only), number of births, first-degree
relative with breast cancer, previous benign breast disease and year at interview.

Table 5. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (adjusting for income instead of
education)

1.00                 1.00
1.67 (0.97-2.86)     1.97 (0.82-4.78)
1.41 (0.81-2.45)     1.21 (0.48-3.00)
1.78 (1.04-3.04)*    2.87 (1.19-6.91)*    0.11

* Odds ratios and 95% confidence intervals adjusted for age, income, race, BMI (1 yr before interview), age at
first birth, age at menarche, age at menopause (for post-menopausal women only), number of births, first-degree
relative with breast cancer, previous benign breast disease and year at interview.

Table 6. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (adjusting for both income and education)

1.67 (0.97-2.86)     1.98 (0.82-4.78)
1.41 (0.81-2.45)     1.21 (0.48-3.00)
1.78 (1.04-3.04)*    2.87 (1.19-6.92)*    0.11

* Odds ratios and 95% confidence intervals adjusted for age, education, income, race, BMI (1 yr before
interview), age at first birth, age at menarche, age at menopause (for post-menopausal women only), number of births,
first-degree relative with breast cancer, previous benign breast disease and year at interview.

Table 7-1. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (in situ cancer only)

Table 7-2. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (in situ cancer only), model 2, with
fewer covariates

Table 8. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (non-in situ cancer only)

1.62 (0.95-2.75)     2.18 (0.95-5.00)
1.39 (0.81-2.39)     1.09 (0.46-2.60)
1.55 (0.91-2.65)     2.42 (1.03-5.67)*    0.51


Table 9. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (non-smoker only)

* Odds ratios and 95% confidence intervals adjusted for age, education, race, BMI (1 yr before interview), age at
first birth, age at menarche, age at menopause (for post-menopausal women only), number of births, first-degree
relative with breast cancer, previous benign breast disease and year at interview.

Table 10. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (smoker only)

1.26 (0.64-2.50)     1.14 (0.38-3.40)
1.24 (0.63-2.42)     0.75 (0.25-2.27)
1.16 (0.59-2.29)     1.24 (0.42-3.68)     0.94

* Odds ratios and 95% confidence intervals adjusted for age, education, race, BMI (1 yr before interview), age at
first birth, age at menarche, age at menopause (for post-menopausal women only), number of births, first-degree
relative with breast cancer, previous benign breast disease and year at interview.

Table 11. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (adjusting for pack-years of smoking)

1.72 (1.04-2.85)*    2.19 (0.97-4.93)
1.34 (0.80-2.26)     1.31 (0.56-3.07)
1.61 (0.97-2.68)     2.78 (1.22-6.32)*    0.15

* Odds ratios and 95% confidence intervals adjusted for age, education, race, BMI (1 yr before interview), age at
first birth, age at menarche, age at menopause (for post-menopausal women only), number of births, first-degree
relative with breast cancer, previous benign breast disease, year at interview and pack-years of smoking.

Table 12. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (ER- only)

Table 13. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (ER+ only)

Table 14. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (PR- only)

Table 15. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (PR+ only)

Table 16. Traffic PAH Exposure in Different Years and Risk of Breast Cancer

PAH quartile (range)    Cases   Controls   Crude OR (95% CI)    Adjusted OR (95% CI)   P for trend
2: 7.97-8.54            169     262        1.26 (0.95-1.68)     1.02 (0.72-1.45)
3: 8.54-8.92            151     273        1.08 (0.81-1.44)     0.91 (0.64-1.30)
4: >8.92                146     285        1.00 (0.75-1.34)     0.83 (0.57-1.19)       0.33


Table 16. Traffic PAH Exposure in Different Years and Risk of Breast Cancer (continued)

PAH quartile (range)    Cases   Controls   Crude OR (95% CI)    Adjusted OR (95% CI)   P for trend
2: 6.73-7.53            183     299        1.14 (0.87-1.48)     1.05 (0.76-1.45)
3: 7.53-7.90            176     285        1.15 (0.88-1.50)     1.08 (0.78-1.50)
4: >7.90                159     313        0.95 (0.72-1.24)     0.84 (0.60-1.18)       0.74

Task 3: To evaluate genetic susceptibility in relation to these exposures and breast
cancer risk by examining genetic variability in metabolism by NQO1, GSTM1, GSTP1
and CYP1A1.


Blood samples are at Dr. Peter Shields’ laboratory (Lombardi Cancer Center,
Georgetown University). The genotyping analyses for GSTP1 and NQO1 are currently in
process. Data analyses are also still underway; thus the results shown here are
preliminary, and include only GSTM1 and CYP1A1.

PAH  Genetic  Susceptibility  and  Risk  of  Breast  Cancer 

PAHs are ubiquitous and occur in the ambient environment at low levels. Using
classic epidemiological research methods to detect a potentially small effect on breast
cancer may be difficult. As gene-environment interactions play an increasingly
important role in the study of hypothesized weak associations between environmental
factors and cancer risk, more studies are needed to detect the role of genetic
polymorphisms in breast cancer. Polymorphisms in the enzymes involved in PAH
metabolism (bioactivation or detoxification) may determine an individual’s susceptibility
to the formation of DNA adducts. A number of papers have discussed the possible effect
of CYP1A1, GSTM1 and GSTP1 on the risk of breast cancer in relation to PAHs.
CYP1A1 is the cytochrome P450 isoenzyme involved in the activation of PAHs.
Mutations in the CYP1A1 gene may increase the susceptibility of individuals to DNA
damage. Glutathione S-transferase (GST) is involved in the detoxification of PAHs by
catalyzing conjugation of glutathione with diol epoxides, thus preventing the formation of
PAH-DNA adducts. Among the many classes of GST, the GSTM1 genetic polymorphism
is a deletion of the entire gene, and a large proportion of people carry the null allele.
Lack of the GSTM1 enzyme may be associated with a higher breast cancer risk. GSTP1,
another class of GST, may also play an important role in breast cancer etiology. One
study showed that the GSTM1 null genotype was associated with increased PAH-DNA adducts
in breast cancer cases. However, the role of these polymorphisms is still not clear, and
more epidemiological studies are needed to test or confirm their effects.

Laboratory  Methods 

Blood clots from participants were removed from the freezer and shipped on dry ice
to Dr. Peter Shields' laboratory (the Lombardi Cancer Center, Georgetown Medical
Center) for DNA extraction and genotyping analysis. For participants without blood
samples, we used urine and saliva samples instead. The genes examined in the
study include GST M1-1, GST P1-1 and CYP1A1. To control the quality of these
analyses, a positive and a negative control were included for every 20 samples, and
blind duplicates were included among the samples. The technicians who perform these
tests do not know the case or control status, or other characteristics, of the study
subjects. These analyses are currently underway.

Results 

The GSTM1 null genotype was not associated with breast cancer risk (Table 1-1),
even after we restricted the analysis to whites (Table 1-2). The CYP1A1 mutant
genotype was associated with decreased risk of breast cancer (Table 4-1); an association
of the same direction and magnitude was seen after we restricted the analysis to whites,
but it was no longer statistically significant (Table 4-2).

In the analyses stratified by GSTM1 genotype, the originally observed
association between PAH exposure at first birth and increased breast cancer risk among
postmenopausal women remained only in the GSTM1 null genotype (Tables 2 and
3). Because of the small sample size, the stratified analyses for CYP1A1 genotypes,
particularly the CYP1A1 mutant genotype, could not be conducted (Tables 5 and 6).
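
The stratified comparison above rests on odds ratios estimated separately within each genotype stratum. As a minimal sketch, assuming a simple 2x2 table of exposed and unexposed cases and controls, a crude odds ratio with a Woolf-type 95% confidence interval could be computed as shown below. The function name and the counts are hypothetical and are not study data; the estimates in this report were adjusted for covariates, which this sketch does not attempt.

```python
import math


def crude_odds_ratio(exp_cases, unexp_cases, exp_controls, unexp_controls, z=1.96):
    """Crude odds ratio with a Woolf (log-based) confidence interval.

    All four cell counts must be positive. No continuity correction and
    no covariate adjustment are applied in this sketch.
    """
    cells = (exp_cases, unexp_cases, exp_controls, unexp_controls)
    if min(cells) <= 0:
        raise ValueError("all cell counts must be positive")
    odds_ratio = (exp_cases * unexp_controls) / (unexp_cases * exp_controls)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts for illustration only (not taken from the study):
# (exposed cases, unexposed cases, exposed controls, unexposed controls)
hypothetical_strata = {
    "GSTM1 null": (30, 20, 40, 60),
    "GSTM1 wt": (25, 25, 50, 50),
}
for genotype, counts in hypothetical_strata.items():
    or_, lo, hi = crude_odds_ratio(*counts)
    print(f"{genotype}: OR = {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")
```

With small strata the interval widens quickly, which is one way to see why the CYP1A1 mutant stratum could not be analyzed.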

Discussion 

Consistent with our hypothesis and previous studies, our data suggest that
the GSTM1 null genotype is not associated with breast cancer risk, and that high PAH
exposure was associated with increased breast cancer risk only among postmenopausal
women with the GSTM1 null genotype. The results for CYP1A1 and breast cancer risk were
somewhat unexpected, and these preliminary findings are currently under further
investigation.

As in most studies of gene-environment interaction, the power
of this study is limited in stratified analyses. Because residential addresses
in earlier life, especially at birth, were often missing, the power of this study is
limited by small sample size, preventing us from examining gene-gene interactions.
The metabolism of PAHs may be affected by multiple genes, and other compounds may
also affect the activity of the enzymes involved in PAH regulation.

Table 1-1. GSTM1 and Risk of Breast Cancer
Table 1-2. GSTM1 and Risk of Breast Cancer (whites only)
Table 2. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (GSTM1 wt)
Table 3. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (GSTM1 null)
Table 4-1. CYP1A1 and Risk of Breast Cancer
Table 4-2. CYP1A1 and Risk of Breast Cancer (whites only)
Table 5. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (CYP1A1 wt)
* Odds ratios and 95% confidence intervals adjusted for age, education, race, BMI (1 yr before interview), age at
first birth, age at menarche, age at menopause (for post-menopausal women only), number of births, first-degree
relative with breast cancer, previous benign breast disease and year at interview.
Table 6. Traffic PAH Exposure in Different Time Periods and Risk of Breast Cancer (CYP1A1 mutant)

CONCLUSION


Overall, our research activities have found evidence to support the hypothesis that
early life exposure to environmental pollutants may be associated with breast cancer risk.
Specifically, we found evidence of geographic clustering of residence; premenopausal
women with breast cancer were more clustered than controls. The evidence for clustering
of residential locations at birth and menarche was stronger than the evidence for
clustering at either the time of women's first birth or current residence. We also observed
a general tendency toward clustering of lifetime residence. Our findings suggest that there
may be identifiable etiological processes linking exposure and breast cancer risk, and that
early exposures may be of particular importance.

Furthermore,  we  examined  early  life  exposure  to  high  concentrations  of  total 
suspended  particulates,  a  surrogate  for  polycyclic  aromatic  hydrocarbons,  in  relation  to 
breast  cancer  risk.  We  observed  that  exposure  to  high  concentrations  of  total  suspended 
particulates  at  birth  was  associated  with  an  increase  in  risk  of  breast  cancer  for 
postmenopausal women. Conversely, in premenopausal women, the results were
inconsistent with our hypothesis. Early life exposure to environmental tobacco smoke was
suggestive of a slight increase in the risk of breast cancer; however, we cannot exclude
the possibility that exposure was unrelated to risk.

Using a geographic model developed and validated with both Long Island and local
data, we found some evidence that high traffic PAH exposure at menarche and first birth
was associated with increased risk of breast cancer; this increased risk was seen only in
non-smokers.

Genetic susceptibility analyses are currently in progress. Our preliminary results
suggest that the GSTM1 null genotype is not associated with breast cancer risk, and that
high PAH exposure was associated with increased breast cancer risk only among women
with the GSTM1 null genotype.

KEY RESEARCH ACCOMPLISHMENTS


•  We  have  identified  and  completed  data  entry  for  relevant  industrial  sites  and 
major  roadways  during  time  periods  under  investigation. 

•  We  have  identified  additional  sources  of  information  regarding  historical  sources 
of  the  exposures  of  interest  and  their  locations  and  amounts. 

•  We  have  verified  and  geocoded  residential  histories  of  study  participants. 

•  We  have  completed  the  geocoding  for  study  participants  for  their  residence  at  the 
time  of  their  birth,  at  menarche,  when  they  had  a  first  birth  and  10  and  20  years 
before  diagnosis  (cases)  or  interview  (controls),  approximately  20,000  addresses 
in  Erie  and  Niagara  counties. 

•  We  have  conducted  a  validation  study  of  the  positional  accuracy  of  geocoded 
residences.  Results  of  this  validation  study  were  published  (1). 

•  We  completed  a  GIS-based  spatial  and  temporal  analysis  for  residences  of  breast 
cancer  cases  and  controls  at  early  life  and  found  strong  evidence  of  spatial 
clustering  for  cases  during  this  time.  A  manuscript  has  been  accepted  for 
publication  (2)  and  abstracts  have  been  presented  (7,  9,  10, 12, 14,  15). 

•  We  have  completed  data  analysis  examining  early  life  exposure  to  total  suspended 
particulates  and  exposure  to  environmental  tobacco  smoke  in  relation  to  risk  of 
breast  cancer  in  adult  life.  Two  abstracts  from  this  work  were  presented,  one  at 
the  annual  meeting  of  the  Society  for  Epidemiologic  Research,  the  other  at  the 
Annual  Meeting  of  the  Association  for  Cancer  Research  (6,  8).  Manuscripts  for 
this  have  been  submitted  for  publication  (3, 4). 

•  We  have  completed  data  analysis  examining  early  life  proximity  to  industrial  sites 
contracted  by  the  US  Atomic  Energy  Commission  in  relation  to  risk  of  breast 
cancer  in  adult  life.  An  abstract  was  presented  (16). 

•  DNA  extraction  for  approximately  3000  samples  is  close  to  completion  and 
assessment  of  genotypes  for  four  genes  is  underway. 

•  Doctoral  dissertation  (PhD),  “Environmental  Exposures  in  Early  Life  and  the  Risk 
of  Breast  Cancer,”  was  completed  April  3, 2003.  Dr.  Matthew  Bonner  is  now  a 
postdoctoral  fellow  in  Environmental  Epidemiology  at  NCI. 

•  Doctoral  dissertation  (PhD),  “Geographical  epidemiology  of  breast  cancer  in 
western  New  York:  Exploring  spatio-temporal  clustering  in  GIS,”  was  completed 
December  10, 2002.  Dr.  Daikwon  Han  is  working  on  a  postdoctoral  fellowship 
funded  by  the  USAMRMC. 

•  We completed updating of the lifetime residential histories of breast cancer cases and
controls in western New York; all Erie and Niagara county residential locations
were identified and geocoded, and these were merged into one database. We
checked the consistency of geocoded addresses at different time points for each
individual, updated incomplete addresses using Polk searches, and validated the
consistency of the reported years of moving in and out of each residence.

•  We  completed  additional  validation  of  the  traffic  model  and  further  calibrated  the 
model  parameters.  This  traffic  model,  used  to  reconstruct  historical  traffic  PAH 
exposure,  was  originally  developed  from  the  Long  Island  Breast  Cancer  Project. 

•  We completed data analysis examining traffic PAH exposure and risk of breast
cancer at menarche, year of first birth, and year of exposure. A manuscript is
being prepared.

•  We completed preliminary data analysis examining PAH metabolic
polymorphisms, GSTM1 and CYP1A1, in relation to breast cancer risk and
gene-environment interaction. A manuscript is being prepared.

•  A manuscript is in preparation regarding the clustering of lifetime residence and
breast cancer risk using exploratory spatial analysis tools based on these lifetime
residential history data.

•  An abstract will be presented at the annual meeting of the International Society
for Environmental Epidemiology in New York City, New York, August 2004.

•  A paper was a semi-finalist in the Nystrom dissertation competition of the
Association of American Geographers (AAG), and the paper was presented at the
centennial meeting of the AAG, Philadelphia, PA, March 2004 (14).

•  Doctoral dissertation (PhD), “Environmental Exposure to Polycyclic Aromatic
Hydrocarbons (PAHs), Genetic Susceptibility and Risk of Breast Cancer,” to be
completed by September 1, 2004.

Positional Accuracy of Geocoded Addresses
in Epidemiologic Research

Matthew R. Bonner,* Daikwon Han,† Jing Nie,* Peter Rogerson,† John E. Vena,* and
Jo L. Freudenheim*


Background: Geographic information systems (GIS) offer powerful
techniques for epidemiologists. Geocoding is an important step
in the use of GIS in epidemiologic research, and the validity of
epidemiologic studies using this methodology depends, in part, on
the positional accuracy of the geocoding process.

Methods: We conducted a study comparing the validity of positions
geocoded with a commercially available program to positions
determined by Global Positioning System (GPS) satellite receivers.
Addresses (N = 200) were randomly selected from a recently
completed case-control study in Western New York State. We
geocoded addresses using ArcView 3.2 on the GDT Dynamap/2000
U.S. Street database. In addition, we measured the longitude and
latitude of these addresses with a GPS receiver. The distance
between the locations obtained by these two methods was calculated
for all addresses.

Results: The distance between the geocoded point and the GPS
point was within 100 m for the majority of subject addresses (79%),
with only a small proportion (3%) having a distance greater than
800 m. The overall median distance between GPS points and
geocoded points was 38 m (90% confidence interval [CI] = 34-46).
Distances were not different for cases and controls. Urban addresses
(median = 32 m; CI = 28-37) were slightly more accurate than
nonurban addresses (median = 52 m; CI = 44-61).

Conclusions: This study indicates that the suitability of geocoding
for epidemiologic research depends on the level of spatial resolution
required to assess exposure. Although sources of error in positional
accuracy for geocoded addresses exist, geocoding of addresses is,
for the most part, very accurate.

Key Words: geographic information systems, geocoding, address
matching, epidemiology

(Epidemiology 2003;14:408-412)

Geographic information systems (GIS) are increasingly
used by epidemiologists to screen and test hypotheses
about environmental exposures and disease. GIS techniques
lend themselves to assessing residential or occupational
proximity to exposures and to assessing spatial variation
in epidemiologic measures. An important first step is
often to geocode the study participant addresses. Geocoding,
also referred to as address matching, is the process whereby
the relative positions of addresses are linked to a reference
theme, which is a database that contains both address information
and locational information (ie, latitude and longitude).
A reference theme for geocoding, therefore, can be considered
an electronic version of a street map. Geocoding is an
attractive method for epidemiologists because GIS software
is relatively inexpensive, uses routinely collected address
data, and is very efficient at locating addresses. Verification
of each subject's address location by other methods would
require considerable time and resources, especially for a large
study. Furthermore, verifying address locations is not feasible
when subjects' lifetime series of addresses are considered
because the number of these addresses can become extremely
large.

The validity of epidemiologic studies using GIS and
geocoding methods depends on the proportion of addresses
that can be geocoded as well as the positional accuracy of the
geocoding process. Several studies have assessed the address
matching rate of commercial geocoding companies, and
found that matching rates are typically 60-80%. No previously
published studies have assessed the positional accuracy
of geocoding in epidemiologic research. Positional inaccuracy
of geocoded addresses may be an important source of
exposure misclassification in environmental epidemiology.

We describe here a study comparing the location of
addresses measured by global positioning system (GPS) receivers
(devices that use satellite signals to estimate the
latitude and longitude of any given location) to positions
geocoded with a commercially available reference theme. We
assessed three areas that may be important to determine the
appropriateness of geocoding in epidemiologic research.
First, we compared the positional accuracy of historical
addresses. Many exposures that are relevant to disease outcomes
are historical in nature, and positional inaccuracy in
geocoding historical addresses may be a source of error in
estimating these historical exposures. Second, we investigated
whether positional inaccuracy of geocoded addresses
would result in differential exposure misclassification between
cases and controls. Third, we compared differences in
positional accuracy between urban and nonurban areas. This
urban-rural differential is particularly important because the
reference themes commonly available were designed for
nonepidemiologic purposes and are generally thought to be more
accurate and complete in urban areas than in nonurban areas.

METHODS 

We obtained a random sample of 200 addresses from a
recently completed case-control study in Erie and Niagara
Counties in Western New York State. Lifetime residential
histories were collected from 3,286 subjects for a total of
20,240 addresses. These included study participants' current
addresses and all previous addresses dating back to 1918. For
the remainder of this article, addresses before the current
address of each participant are termed “historical addresses.”
Most addresses (N = 15,903) were for residences in Erie and
Niagara Counties. We geocoded Erie and Niagara County
addresses using ArcView 3.2 (ESRI, Inc., Redlands, CA) and
the Dynamap/2000 US Street Database (Geographic Data
Technologies, Inc., Lebanon, NH) for Erie and Niagara
Counties as the reference theme. Essentially, the Dynamap/2000
is an enhancement of the Topologically Integrated
Geographic Encoding and Referencing file (TIGER/line file)
that was developed by the US Bureau of the Census. These
are data files that contain street address ranges and census
tract/block boundaries. We matched 10,356 (65%) of the
original 15,903 addresses using the initial geocoding process.

To ensure an adequate number of cases and controls for
comparison of urban and nonurban positional accuracy, we
randomly selected 200 addresses in a random block selection
scheme to obtain 50 cases and 50 controls from urban areas
and 50 cases and 50 controls from nonurban areas. We
defined urban addresses as addresses within the city limits of
Buffalo, Niagara Falls, and Kenmore, NY. All other addresses
were considered nonurban. If an address could not be
located for the GPS measurements, then that address was
discarded and a new address was randomly selected from the
same block.


We determined the geocoded latitude and longitude for
each address with ArcView 3.2 by first geocoding the addresses
and then changing the map projection to Universal
Transverse Mercator-1983. This projection compensates for
the Earth's curvature and generates more precise estimates of
latitude and longitude than other projections. Latitude and
longitude were then converted into x and y coordinates
(arbitrary values representing a point on a plane) for each
address. These x and y coordinates were measured in meters
for this study.

We measured the actual geographic position of the 200
addresses with an Etrex GPS receiver manufactured by
Garmin (Garmin International, Inc., Olathe, KS). This GPS
receiver reported latitude and longitude in decimal degrees to
five places using the World Geodetic System 1984 map
datum. Before site visits, the GPS receiver was turned
on and automatically searched for at least three satellite signals.
Once the satellite signals were detected, the GPS receiver
provided real-time current latitude, longitude, speed, and
direction. Investigators then visited each address and used the
GPS receiver to record latitude and longitude from the street
directly in front of each address.

The observed GPS latitudes and longitudes were then
converted to x and y coordinates in Universal Transverse
Mercator-1983 projection for comparison between the GPS
and the geocoded positions. We calculated the distance between
the two points for each address by determining the
Euclidean length of the hypotenuse of the right triangle
formed by the two points: $\sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$, where
$x_1$ is the GPS latitude, $x_2$ is the geocoded latitude, $y_1$ is the
GPS longitude, and $y_2$ is the geocoded longitude. This formula
does not correct for the curvature of the Earth. However,
this uncorrected formula does not introduce sizable
error in the distance calculation between the two points
because the distances between the points were relatively
small.
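
The distance calculation described above can be sketched in a few lines. This is a minimal illustration rather than the software workflow used in the study; the function name and the sample coordinates are hypothetical, and both points are assumed to be already projected to planar coordinates in meters.

```python
import math


def planar_distance_m(x1, y1, x2, y2):
    """Euclidean distance between two projected points, in meters.

    (x1, y1) is the GPS position and (x2, y2) is the geocoded position.
    No correction for the Earth's curvature is applied, which is
    reasonable only when the two points are close together.
    """
    return math.hypot(x1 - x2, y1 - y2)


# Hypothetical projected coordinates (meters), for illustration only.
gps_point = (674_230.0, 4_751_980.0)
geocoded_point = (674_260.0, 4_752_020.0)
print(planar_distance_m(*gps_point, *geocoded_point))  # 30 m and 40 m offsets
```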

The mean distance, its standard deviation, and the
median distance between the geocoded points and the GPS
points were calculated with SPSS version 10.1 for the total
sample, for case and control addresses, for urban and nonurban
addresses, and for cases and controls stratified by
urban/nonurban status. We grouped distance into nine categories:
<25 m, 25-50 m, 51-75 m, 76-99 m, 100-199 m, 200-399
m, 400-599 m, 600-799 m, and >800 m; the proportion of
addresses in each of these categories was then computed.
Bootstrapped 90% confidence intervals for the medians of
distance were computed with Resampling Stats (Resampling
Stats, Inc., Arlington, VA).
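
The summary steps above (a bootstrapped interval for the median and grouping into distance categories) can be sketched as follows. This is an illustration under stated assumptions, not a reproduction of the SPSS or Resampling Stats procedures: the percentile bootstrap, the number of resamples, the rounding of distances to whole meters before binning, the treatment of exactly 800 m, the function names, and the sample distances are all assumptions made for the example.

```python
import random
import statistics


def bootstrap_median_ci(distances, level=0.90, n_resamples=2000, seed=0):
    """Percentile bootstrap confidence interval for the median."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(distances)
    # Resample with replacement and record the median of each resample.
    medians = sorted(
        statistics.median(rng.choices(distances, k=n))
        for _ in range(n_resamples)
    )
    tail = (1.0 - level) / 2.0
    lower = medians[int(tail * n_resamples)]
    upper = medians[int((1.0 - tail) * n_resamples) - 1]
    return statistics.median(distances), lower, upper


# Upper bound (whole meters) and label for each category in the text.
CATEGORY_BOUNDS = [
    (24, "<25 m"), (50, "25-50 m"), (75, "51-75 m"), (99, "76-99 m"),
    (199, "100-199 m"), (399, "200-399 m"), (599, "400-599 m"),
    (799, "600-799 m"),
]


def distance_category(distance_m):
    """Assign a distance to one of the nine categories.

    Assumption: distances are rounded to whole meters first, and a
    distance of exactly 800 m is placed in the last category.
    """
    rounded = round(distance_m)
    for upper_bound, label in CATEGORY_BOUNDS:
        if rounded <= upper_bound:
            return label
    return ">800 m"


# Hypothetical distances in meters, for illustration only.
hypothetical_distances = [12, 27, 31, 38, 44, 52, 96, 310, 1225]
median, lower, upper = bootstrap_median_ci(hypothetical_distances)
print(f"median = {median} m (90% CI {lower}-{upper})")
print([distance_category(d) for d in hypothetical_distances])
```

The median and a resampling interval are used here because a few very large distances would dominate a mean-based summary.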

RESULTS 

The majority of the 200 randomly selected addresses
(N = 133) were historical in nature; subjects did not currently
occupy these addresses (Table 1). The median distances
between the GPS and the geocoded position did not vary
greatly across decades. For the current addresses, there was a
slightly larger median distance (49 m; 90% confidence interval
[CI] = 38-59) between the measured address location
and the geocoded location, whereas the distances for the
historical addresses tended to be between 30 and 40 m.
However, the three addresses with distances greater than
1,000 m were all historical addresses.

TABLE 1. Distance (m) Between Geocoded Position and GPS Position for 200
Addresses, by the Year Moved Out of Address

Year Moved Out      % of Addresses   Median Distance   Minimum    Maximum
of Address*         (N = 200)        (90% CI)          Distance   Distance

1930-1939            1.5             33 (16-50)        16
1940-1949            8.0             40 (32-53)        17         1225
1950-1959           11.5             38 (26-53)         9
1960-1969           11.5             29 (26-38)         9
1970-1979           14.5             36 (29-71)         7         2552
1980-1989           12.0             38 (28-59)         5          313
1990-2000            7.5             34 (28-70)        12
Currently occupy    32.5             49 (38-59)         6
Unknown              1.0             96 (19-172)       19          172
Total              100               38 (34-46)         5         2552

GPS = global positioning system.

*Indicates the decade when a study participant moved out of the address.

Tables 2 and 3 present comparisons of the geocoded
and GPS positions. The distribution of distances between
geocoded and GPS positions is skewed to the right, as
evidenced by the median distance for all addresses (38 m;
90% CI = 34-46) being considerably smaller than the mean
distance (113 m; Table 2). Consequently, the median is a more
appropriate measure of central tendency than the mean. The
distance between the geocoded point and the GPS point was
within 100 m for the majority of all subject addresses
(79%), with only a small proportion (3%) having a distance
greater than 800 m. Distances were not different for cases and
controls.

TABLE 2. Distance Between Geocoded Position and GPS
Position for Case and Control Addresses

                      Cases        Controls     All Addresses
                      (N = 100)    (N = 100)    (N = 200)

Distance (m)
  Median              41           38           38
  90% CI              31-47        33-49        34-46
  Mean (SD)           119 (247)    107 (286)    113 (266)
  Minimum distance    5            5            5
  Maximum distance    1151         2552         2552
Distance (%)
  <25 m               28           26           27
  25-50 m             33           34           33.5
  51-75 m             12           16           14
  76-99 m             5            4            4.5
  100-199 m           10           9            9.5
  200-399 m           5            8            6.5
  400-599 m           1            1            1
  600-799 m           2            0            1
  >800 m              4            2            3

GPS = global positioning system; SD = standard deviation.

Positional accuracy was somewhat better for the urban
addresses (32 m; 90% CI = 28-37) than for the nonurban
addresses (52 m; 90% CI = 44-61) (Table 3). In addition to
having a smaller median distance, urban addresses had a
higher proportion of addresses within 100 m (89%) than the
nonurban addresses (69%). Within the urban strata, cases and
controls were more similar than the cases and controls in the
nonurban strata. In the nonurban strata, there was a 9-m
difference in the medians between cases (45 m; 90% CI =
40-70) and controls (54 m; 90% CI = 38-66). In addition,
urban cases (86%) and controls (92%) had a higher proportion
of addresses within 100 m than did nonurban cases
(70%) and controls (68%).

TABLE 3. Distance Between Geocoded Position and GPS Position for Case and Control Addresses, by Urban/Nonurban
Residential Status

                      Urban                                   Non-Urban
                      Cases        Controls     Total         Cases        Controls     Total
                      (N = 50)     (N = 50)     (N = 100)     (N = 50)     (N = 50)     (N = 100)

Distance (m)
  Median              31           32           32            45           54           52
  90% CI              27-41        28-37        28-37         40-70        38-66        44-61
  Mean (SD)           122 (285)    70 (177)     96 (237)      116 (204)    125 (362)    129 (293)
  Minimum distance    6            5            5             5            12           5
  Maximum distance    1551         1225         1551          1223         2552         2552
Distance (%)
  <25 m               36           30           33            20           22           21
  26-50 m             34           46           40            32           22           27
  51-75 m             14           12           13            10           20           15
  76-99 m             2            4            3             8            4            6
  100-199 m           4            2            3             16           16           16
  200-399 m           0            4            2             10           12           11
  400-599 m           2            0            1             0            2            1
  600-799 m           2            0            1             2            0            1
  >800 m              6            2            4             2            2            2

GPS = global positioning system.

DISCUSSION

Overall, the positional accuracy of addresses geocoded
with the Dynamap/2000 was good. The majority of addresses
were located within 100 m of the real address as determined
by on-site GPS latitude and longitude measurements; the
historical addresses tended to have smaller median distances
than the current addresses. There was, however, some difficulty
in assessing all the selected historical addresses. In eight
instances, subjects' homes appeared to have been demolished,
leaving a vacant lot. The uncertainty about whether
these lots were the correct street address prevented GPS
measurements of these addresses. This indicates that in areas
where there has been major redevelopment or neglect the
positional accuracy of historical addresses may be more
difficult to determine.

Positional accuracy was not different between cases
and controls, suggesting that errors resulting from the geocoding
of addresses may not result in differential misclassification
of exposure. In addition, the difference in positional
accuracy between urban addresses and nonurban addresses
was small. However, even with good overall positional accuracy,
there were several sources of error. First, the Dynamap/2000
is largely derived from the US Bureau of the
Census TIGER/line files, and inaccurate methods were used
to create these TIGER/line files. The Geography Division of
the US Bureau of the Census has reported that the median
distances between GPS measured positions and the TIGER/line
file positions in eight US counties ranged between 30 and
121 m. Additionally, this report indicated that TIGER/line
file updates since 1990 are less accurate than the pre-1990
versions. These inaccuracies may have important implications
for epidemiologic use because, as the updates to
the TIGER/line files provide more complete coverage and
increase the address-matching rate, the decrease in positional
accuracy may lead to increased error and misclassification of
exposure when estimating exposures based on geographic
positioning.

The second source of error with geocoding arises from
the geocoding process itself. We found that positional accuracy
was slightly decreased in nonurban addresses compared
with urban addresses, likely a result of the geocoding process.
Geocoding uses interpolation to estimate the relative position
of an address on a line segment in the reference theme. The
likelihood of inaccurate interpolation by the geocoding process
is higher for an address location in areas with long street
segments than in areas where there are many short street
segments, regardless of urban/nonurban location; however,
nonurban areas generally have longer street segments than
urban areas.
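
The interpolation step described above can be illustrated with a minimal sketch. It assumes a straight street segment with a known address range and simple linear interpolation between its endpoints; it is not the algorithm implemented in any particular GIS package, and the function name, parameters, and sample values are hypothetical.

```python
def interpolate_address(house_number, range_start, range_end, start_xy, end_xy):
    """Estimate an address position by linear interpolation along a segment.

    range_start and range_end are the house numbers at the two ends of
    the street segment; start_xy and end_xy are the planar coordinates
    (meters) of those ends. Real geocoders also handle odd/even sides
    and offsets from the centerline, which this sketch ignores.
    """
    if range_start == range_end:
        fraction = 0.5  # degenerate range: place at the segment midpoint
    else:
        fraction = (house_number - range_start) / (range_end - range_start)
    fraction = min(max(fraction, 0.0), 1.0)  # keep the point on the segment
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return x, y


# Hypothetical 1,000-m segment carrying house numbers 100 to 198.
print(interpolate_address(149, 100, 198, (0.0, 0.0), (1000.0, 0.0)))
```

Because the estimated position is a fraction of the segment length, the same proportional error in where a house actually sits translates into more meters on a long segment than on a short one.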

In addition to error in positional accuracy from the
Dynamap/2000 and geocoding, GPS receivers are also prone
to error. These errors are generally small but remain a
limitation of this study in that the GPS receiver was used as
the standard. GPS errors in positional accuracy arise from
three general sources: satellite-related errors, signal propagation
errors, and receiver errors. Garmin reports that the Etrex
has positional accuracy between 1 and 5 m. Field tests, however,
indicate that civilian GPS receivers are only accurate
within 15-40 m. The Dynamap/2000 may actually be more
accurate than we report here because most of the distances
between the GPS point and the geocoded point were within
the range of error for the GPS unit.
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These  errors  in  the  positional  accuracy  of  geocoding 
study  participant  addresses  and  sources  of  exposure  have 
several  implications  for  epidemiologic  research.  First,  numer¬ 
ous  epidemiologic  studies  have  crudely  defined  exposure 
based  solely  on  proximity  of  a  residential  address  to  an 
exposure  of  interest.  For  instance,  McLaughlin  and  associ¬ 
ates^®  used  a  25-km  radius  around  nuclear  facilities  in  Canada 
to  classify  those  residing  within  that  circle  as  exposed  and 
those  outside  as  unexposed.  Clearly,  geocoding  has  sufficient 
spatial  resolution  to  distinguish  differences  on  this  scale.  In 
another  study,  Croen  et  al*^  investigated  maternal  residential 
proximity  to  hazardous  waste  sites  and  congenital  malforma¬ 
tions.  Maternal  residence  within  1  mile  (1.6  km)  of  a  hazard¬ 
ous  waste  site  was  defined  as  exposed.  Again,  even  with  the 
higher  required  spatial  resolution  to  define  exposure,  geo¬ 
coding  should  have  sufficient  positional  accuracy  to  classify 
exposure  appropriately. 

There are exposures, however, with great spatial variation
over relatively small distances. When these exposures
are being considered, the study may require more precise
methods to locate subjects and exposure sources to produce
valid risk estimates. Electromagnetic fields, for instance,
require high spatial resolution to estimate exposure accurately.
The intensity of magnetic fields decreases rapidly
with distance. In many epidemiologic studies that
investigate electromagnetic fields and cancer, residential
proximity to power lines and electricity transmission equipment
has been measured in meters rather than kilometers or
miles as in the previous examples. For example, in their
meta-analysis of residential proximity to electricity transmission
and distribution equipment and childhood cancer, Washburn
et al used less than 50 m to define the exposed group.
Based on the present validation of geocoding, there is some
question whether the positional accuracy of geocoding is
sufficient at this scale. Use of geocoding in situations where high spatial
resolution is required may lead to extensive nondifferential
misclassification of exposure, thereby greatly reducing the
validity of the risk estimates.
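
The effect of positional error on proximity-based exposure classification can be illustrated with a small sketch. The three radii (50 m, 1.6 km, 25 km) are the thresholds cited above; the residence coordinates and the 35 m geocoding error are hypothetical values chosen only for illustration, and coordinates are assumed to be planar and in meters.

```python
import math


def is_exposed(residence_xy, source_xy, radius_m):
    """Classify a residence as exposed if it lies within radius_m of the source."""
    return math.dist(residence_xy, source_xy) <= radius_m


source = (0.0, 0.0)
true_location = (40.0, 0.0)       # hypothetical true position, 40 m from the source
geocoded_location = (75.0, 0.0)   # same residence with an assumed 35 m geocoding error

for radius_m in (50.0, 1600.0, 25000.0):
    truth = is_exposed(true_location, source, radius_m)
    geocoded = is_exposed(geocoded_location, source, radius_m)
    status = "misclassified" if truth != geocoded else "ok"
    print(radius_m, truth, geocoded, status)
```

Under these assumptions the residence is misclassified only at the 50 m threshold; at the 1.6 km and 25 km thresholds the same positional error does not change the exposure category.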

The generalizability of this study may be limited by
regional variation in the completeness of the Dynamap/2000.
These results should be comparable to those for regions where the
Dynamap/2000 has completeness similar to that of Erie and
Niagara Counties. Furthermore, as the US Bureau of the
Census and GDT continue to develop and improve the
TIGER/line and the Dynamap/2000 databases, regional variation
will diminish. However, recent improvements in the
completeness of these updated street databases may not
alleviate the concerns about their positional accuracy because
post-1990 updates have not been as carefully assembled as
the pre-1990 updates.

Finally, as GIS becomes more commonly used in epidemiologic
research, the need to assess geocoding methods
and reference themes will become more important, especially
when high spatial resolution is required to classify a study
subject’s exposure accurately. Overall, this study indicates
that these tools are sufficiently accurate for some, but not all,
epidemiologic studies. Consequently, care should be taken in
the interpretation of results, taking into account sources of
error in positional accuracy for geocoded addresses that may
affect exposure classification.

Abstract


Objective:  This  study  focused  on  geographic  clustering  of  breast  cancer  based  on 
residence  in  early  life  and  identified  spatio-temporal  clustering  of  cases  and  controls. 
Methods:  Data  were  drawn  from  the  WEB  study  (Western  New  York  Exposures  and 
Breast  Cancer  Study),  a  population-based  case-control  study  of  incident,  pathologically 
confirmed  breast  cancer  (1996-2001)  in  Erie  and  Niagara  counties.  Controls  were 
frequency-matched  to  cases  on  age,  race,  and  county  of  residence.  All  cases  and  controls 
used in the study provided lifetime residential histories. The K-function difference
between cases and controls was used to identify spatial clustering patterns of residence
in early life.

Results:  We  found  that  the  evidence  for  clustered  residences  at  birth  and  at  menarche 
was  stronger  than  that  for  first  birth  or  other  time  periods  in  adult  life.  Residences  for 
premenopausal  cases  were  more  clustered  than  for  controls  at  the  time  of  birth  and 
menarche.  We  also  identified  the  size  and  geographic  location  of  birth  and  menarche 
clusters  in  the  study  area,  and  found  increased  breast  cancer  risk  for  pre-menopausal 
women  whose  residence  was  within  the  cluster  compared  to  those  living  elsewhere  at 
the  time  of  birth. 

Conclusion:  This  study  provides  evidence  that  early  environmental  exposures  may  be 
related  to  breast  cancer  risk,  especially  for  pre-menopausal  women. 

Key words: spatial clustering, breast cancer, early-life exposure

Introduction

Breast cancer is one of the leading causes of death among women in the United
States. However, the epidemiology of breast cancer is not yet fully understood. We also
do not fully understand mechanisms for the known risk factors; for instance, why changes
in age at menarche or age at first birth have an impact on breast cancer risk. A substantial
degree of geographical variation in breast cancer incidence and mortality in the US has
been observed. Although the evidence is inconclusive, several environmental risk factors are also believed
to be involved in breast cancer incidence. There is speculation that environmental
factors may explain geographic variation in breast cancer rates not explained by known
risk factors. For this reason, the potential role of environmental exposures in breast
cancer risk is of particular interest.

In addition, there is a growing interest in early life and lifetime exposures in
relation to breast cancer risk. The life course approach is of interest because there may be
sensitive time periods for exposures and/or there may be cumulative effects of lifetime
exposure involved in breast cancer incidence. Early life has an effect on breast cancer
etiology, as evidenced by known risk factors such as age at menarche, age at first birth,
and parity. There is new evidence that even earlier exposures may have an impact on
adult breast cancer risk. Trichopoulos suggested that the in-utero and perinatal period
might be pathologically significant and that the risk of adult breast cancer could be
related to high estrogen exposure in early life. There is also accumulating evidence that
factors related to early exposure, such as birthweight, may be related to risk.

There has been little research investigating possible effects of environmental
exposures in early life on subsequent breast cancer risk. Using residence as a proxy
measure for environmental exposures, we investigated whether there was any evidence of
geographic clustering of adult breast cancer cases associated with their residences in early
life. Clustering analyses have often been used to provide clues for the unknown etiology
of disease, and thus to generate hypotheses for further epidemiologic research. We
looked at the geographic clustering of residence at early critical time points: at birth, at
menarche, and at the woman’s first birth. By comparing differences in clustering patterns
between case and control residences, we sought to identify time periods
critical to potential environmental exposures and subsequent breast cancer risk.


Methods 

Population-based  case-control  study  of  breast  cancer 

We conducted a case-control study of breast cancer in western New York, the WEB
study (Western New York Exposures and Breast Cancer Study). Cases were women aged
35-79 with incident, primary, pathologically confirmed breast cancer diagnosed in Erie and
Niagara counties during the period 1996-2001, with no previous cancer diagnosis other
than non-melanoma skin cancer. Controls were frequency matched to cases on age, race,
and county of current residence; controls under 65 years of age were randomly selected
from a New York State Department of Motor Vehicles list and those 65 years and over
were chosen from a Health Care Financing Administration list. We ascertained cases by
having a nurse-case finder visit the pathology departments of almost all hospitals in these
counties. One hospital which did not participate does almost no cancer surgery and refers
patients to other participating hospitals. For the one other hospital which did not
participate, breast cancer cases were identified in the practice of the breast surgeons who
see more than 99% of the cases from that hospital. Extensive in-person interviews and
self-administered questionnaires were used to ascertain lifetime residential history and
other breast cancer risk factors. A total of 1166 cases and 2105 controls were interviewed.
Response rates were 72% and 65% for cases and controls, respectively.

All  participants  were  asked  to  complete  a  lifetime  residential  history,  to  list  the 
street  address,  town/city  and  zip  code  for  their  current  address  and  then  all  other  previous 
addresses  throughout  their  lifetime.  Participants  provided  20,240  addresses,  an  average  of 
approximately six addresses for each individual. In this study we focused on residence at
the time of each participant’s birth, at her menarche, and at the time of her first birth.
Analyses  were  restricted  to  women  residing  in  Erie  or  Niagara  counties  at  each  of  these 
time  points.  There  were,  of  course,  participants  whose  addresses  were  the  same  for  two  or 
more  of  these  times. 

For  women  with  incomplete  residential  information,  additional  information  was 
obtained  using  historical  city  directories.  We  used  these  directories  to  find  old  addresses, 
and  utilized  various  resources,  such  as  web  searches  and  commercial  address  databases 
for  recent  addresses.  We  also  examined  validity  and  reliability  of  reports  of  earlier 
residences  in  a  number  of  ways.  For  birth  addresses,  we  asked  for  information  on  birth 
address  twice  and  have  collected  information  on  reliability  of  response.  For  the  other  time 
periods,  we  used  information  on  maiden  name  and  partial  address  information  provided 
by  the  participants  to  search  for  records  in  city  directories  for  the  appropriate  time 
periods. To improve our ability to geocode addresses, we developed several strategies.
First, all addresses were standardized to match the standard format used in GIS.
We used an enhanced version of TIGER (Topologically Integrated Geographic Encoding
and Referencing System), GDT/Dynamap 2000, and overall matching rates
improved by about 15-20% compared with the use of TIGER as a reference theme.
We also used the stand-alone address cleaner ZP4 (Semaphore Co.) to correct and update
zip code information to match United States Postal Service certified addresses.

More  than  85%  of  addresses  were  geocoded  using  the  above  strategies  and 
resources.  We  failed  to  geocode  some  addresses  primarily  because  of  missing  residential 
information,  such  as  missing  street  numbers  or  street  names.  Since  we  are  dealing  with 
historical  residential  information,  the  likelihood  of  missing  previous  residential 
information  was  higher  than  that  for  current  residential  information.  Table  1  is  a 
summary  table  showing  the  numbers  of  cases  and  controls  with  complete  residential 
information  who  resided  in  the  two  counties  for  each  of  the  time  periods.  The  percentage 
of  missing  residential  information  associated  with  each  early  life  event  was  highest  for 
birth  addresses,  at  about  20%. 

Clustering  analyses  of  residences 

To compare clustering patterns of breast cancer cases and controls at each time
period, the primary method used was based on the K-function. The K-function for a point
process is defined as the expected number of events within distance h of an arbitrary event, divided
by the overall intensity of events. It is estimated by

$$\lambda\,\hat{K}(h) = \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} \frac{I(d_{ij} \le h)}{w(s_i, s_j)\, n}, \quad h > 0$$

where n is the number of events, λ is the expected density of events in the study region, h
is the pre-specified distance, d_ij is the Euclidean distance between point i and point j, I is
an indicator function that is equal to one if the inter-event distance d_ij is less than or equal
to h, and zero otherwise, and w(s_i, s_j) is an edge correction estimator, namely the
proportion of the circumference of the circle centered at s_i and passing through s_j that
lies inside the study area. Under the null hypothesis of spatial randomness, the expected
value of K(h) is πh². Geographic clustering will yield values of the K-function that are
greater than this, since clustering will result in more pairs of points separated by a
distance of at most h than would be expected in a random pattern.
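
The estimator above can be sketched in a few lines. This is a simplified illustration rather than the procedure used in the study: it sets the edge correction w(s_i, s_j) to 1 for every pair, so it will underestimate K(h) near the boundary, and the function name and the four test points are hypothetical.

```python
import math


def k_function(points, h, area):
    """Naive K-function estimate at distance h (no edge correction, w = 1)."""
    n = len(points)
    # Count ordered pairs (i, j), i != j, whose separation is at most h.
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= h
    )
    # Dividing by (lambda * n) with lambda = n / area gives pairs * area / n**2.
    return pairs * area / (n * n)


# Hypothetical pattern: the four corners of a unit square.
corners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
k_at_1 = k_function(corners, 1.0, 1.0)
```

For comparison with the null value, the estimate at each h would be set against πh²; with edge correction the same loop would divide each pair's contribution by w(s_i, s_j).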

We used the difference between K-functions for cases and controls to compare two
patterns (i.e., D(h) = K_case(h) - K_control(h)). Positive values of D(h) indicate spatial
clustering of cases relative to the spatial clustering of controls. Under the null hypothesis
of random labeling of cases and controls, the expected value of D(h) is zero, indicating
that the K-functions of the cases and controls are the same. The test statistic, D(h), was
calculated with confidence envelopes using the splancs library in S-plus. We obtained
approximate 95% confidence limits as plus or minus two standard errors (±2√Var{D(h)}) at the
α = 0.05 level. When the estimated function D(h) deviated from zero by more than two
standard errors, we interpreted this as a statistically significant difference between
the case and control patterns.
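
The comparison of D(h) against limits of two standard errors can be sketched as below. This is an assumption-laden stand-in, not the splancs computation: splancs derives the variance of D(h) under random labeling, whereas this sketch approximates the standard error by repeatedly shuffling case and control labels. Edge correction is omitted, and all function names and coordinates are hypothetical.

```python
import math
import random
import statistics


def k_function(points, h, area):
    """Naive K-function estimate at distance h (no edge correction)."""
    n = len(points)
    pairs = sum(
        1
        for i in range(n)
        for j in range(n)
        if i != j and math.dist(points[i], points[j]) <= h
    )
    return pairs * area / (n * n)


def d_statistic(cases, controls, h, area):
    """D(h) = K_case(h) - K_control(h)."""
    return k_function(cases, h, area) - k_function(controls, h, area)


def random_labeling_limits(cases, controls, h, area, n_sim=199, seed=0):
    """Approximate +/- 2 standard error limits for D(h) by shuffling labels."""
    rng = random.Random(seed)
    pooled = list(cases) + list(controls)
    n_cases = len(cases)
    simulated = []
    for _ in range(n_sim):
        rng.shuffle(pooled)  # random relabeling of cases and controls
        simulated.append(d_statistic(pooled[:n_cases], pooled[n_cases:], h, area))
    spread = 2 * statistics.stdev(simulated)
    return (-spread, spread)


# Hypothetical data: tightly grouped cases, widely spread controls.
cases = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1)]
controls = [(5.0, 5.0), (9.0, 1.0), (1.0, 9.0), (9.0, 9.0)]
d_obs = d_statistic(cases, controls, 1.0, 100.0)
lower, upper = random_labeling_limits(cases, controls, 1.0, 100.0, n_sim=50)
```

A value of d_obs outside the interval (lower, upper) would be read, as in the text, as a significant difference between the case and control patterns at that distance.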

We also employed a spatial clustering method to identify significant geographic
clusters of breast cancer cases. The spatial scan statistic, which considers the likelihood
of observing the actual number of cases inside a circle under the null hypothesis of
no clustering, was applied to residence at early life events. We were mainly interested in
spatial clustering of high rates, and employed the Bernoulli model based on the locations
of individual cases and controls. In addition, odds ratios (OR) and 95% confidence
intervals (95% CI) were obtained using logistic regression, adjusting for age, education,
age at menarche, parity, history of benign breast disease, and family history of breast cancer.
All analyses were conducted for the entire group of study participants and for data
stratified on menopausal status. Women were considered postmenopausal if their menses
had ceased permanently and naturally. Other women were also
considered postmenopausal if any of the following conditions was true: they were on
hormone replacement therapy and were over age 55; they had had a bilateral
oophorectomy; they had had a hysterectomy without removal of the ovaries and
were older than 50; or their menses had ceased permanently due to radiation or other
medical treatment and they were older than 55.
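
The menopausal status rules above translate directly into a small classification function. The sketch below is only an illustration of those rules: the record type and its field names are hypothetical and do not correspond to variables in the study's questionnaire.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # Hypothetical field names; one flag per condition listed in the text.
    age: int
    natural_menopause: bool = False
    on_hrt: bool = False
    bilateral_oophorectomy: bool = False
    hysterectomy_ovaries_retained: bool = False
    treatment_induced_cessation: bool = False


def is_postmenopausal(p):
    """Apply the classification rules stated in the Methods."""
    if p.natural_menopause:
        return True
    return (
        (p.on_hrt and p.age > 55)
        or p.bilateral_oophorectomy
        or (p.hysterectomy_ovaries_retained and p.age > 50)
        or (p.treatment_induced_cessation and p.age > 55)
    )
```

For example, a 52-year-old participant on hormone replacement therapy with no other condition would be classified as premenopausal under these rules, while a 52-year-old with a hysterectomy and retained ovaries would be classified as postmenopausal.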

Results 

Characteristics  of  subjects  included  in  the  analysis,  subjects  with  missing 
residential  information,  and  subjects  excluded  due  to  residence  outside  of  Erie  and 
Niagara  counties,  are  shown  in  Table  2.  About  half  of  the  sample  was  excluded  for  each 
time  period;  the  highest  percentage  of  ineligible  cases  and  controls  was  at  the  birth 
residence  (46%  and  51%  respectively).  However,  we  found  little  difference  in 
characteristics  between  those  subjects  included  and  those  subjects  with  addresses  outside 
of  these  two  counties. 

Mapping was used to identify geographic patterns of breast cancer cases and
controls for each of the early life events. Maps showing the locations of cases and
controls in Figure 1 portray the underlying geographic patterns of breast cancer cases and
controls in the study area. The rectangular region was used instead of the actual county
boundary as an approximate boundary of the study area to protect individuals’
confidentiality. The purpose of such mapping is to inspect patterns visually, the first
step in any spatial analysis. Geographic patterns do not appear to vary much from one
time period to the next, and they appear to reflect patterns of population distribution in
the study area. However, it is difficult to determine whether they were clustered or
dispersed relative to population from visual inspection alone, because of the large number
of data points.

To  assess  potential  effects  of  geographic  selection  bias  in  our  study,  we  also 
examined  the  distribution  of  current  residence  in  relation  to  other  population  data  on 
geographic  distribution  of  breast  cancer  cases  and  the  general  population.  We  did  not  find 
differences  in  the  geographic  distribution  of  participating  and  non-participating  cases,  nor 
between  controls  and  the  underlying  population,  except  some  tendency  for  both  cases  and 
controls  living  closer  to  the  interview  site  to  be  somewhat  more  likely  to  participate  than 
those  living  further  away. 

Spatial  clustering  of  residences  associated  with  early  life  events 

We obtained differences between the case and control patterns for locations
associated with each early life event. The K-function differences for values of h up to 15
miles, with approximate 95% confidence envelopes, are shown in Figure 2. The
maximum value of h is generally taken as one-third of the linear extent of the study
area. Any patterns beyond this scale can be disregarded, since either peaks or troughs at
this geographic scale are difficult to interpret, and are potentially misleading. Figure 2a
shows K-function differences for birth residence. The estimated function
shows strong evidence of clustering of cases relative to
controls. There was no significant difference up to 3 miles; statistically significant
differences were detected beyond 3 miles. There is also evidence of some
degree of clustering for breast cancer cases at menarche residence (Figure 2b). Estimates
of the D-function are positive but not statistically significant up to 7 miles; spatial
clustering of breast cancer cases occurs at a scale of about 7-15 miles. For residence at
women’s first birth and for current residence, the difference is not statistically significant;
the plot falls within the confidence envelope over all distances (Figures 2c and 2d).

To determine whether there are any differences in clustering patterns by
menopausal status, the K-function difference was computed for premenopausal and
postmenopausal women separately (Figure 3). We found significant clustering of
premenopausal breast cancer cases compared to controls for both birth and menarche
residence (Figure 3a), while there is no evidence of clustering for postmenopausal breast
cancer cases for either period (Figure 3b). We did not find evidence of clustering for
first birth and current residence (at diagnosis) for either group (not shown). Estimated
functions at birth residence show a strong clustering of premenopausal cases over the
entire geographic scale with a peak at 7 miles. Values are positive for postmenopausal
cases, but not statistically significant. For menarche residence, we also observed a strong
clustering of premenopausal cases with a peak at about 8-10 miles. Again, differences are
not statistically significant for postmenopausal women at menarche residence.

Identifying  the  geographic  location  of  breast  cancer  clusters 

To identify the geographic location of areas with higher intensities for premenopausal
cases in the study area, the spatial scan statistic was applied to residences of
premenopausal women at the time of birth and menarche. Maps in Figure 4 present
results of the clustering analysis. The circle in Figure 4a indicates clustering of birth
residence for premenopausal cases when compared to controls. We found a circular
cluster of birth residence for breast cancer cases with a 5.7 mile radius in the area
including part of the city of Buffalo, and the towns of Amherst, Cheektowaga, and
Tonawanda (shaded areas). There are 100 observed breast cancer cases inside the cluster,
while 76 breast cancer cases are expected. The cluster was significant at p<0.01 with 999
Monte Carlo simulations.
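
The scoring of a single candidate circle under the Bernoulli model, and the Monte Carlo p-value, can be sketched as follows. This is a hedged illustration based on the standard Bernoulli log-likelihood ratio for high-rate clusters, not the software used in the study: the function names are hypothetical, and a full scan would also search over many circle centers and radii and recompute the maximum for each simulated data set.

```python
import math


def _xlogy(x, y):
    """Return x * log(y), treating 0 * log(0) as 0."""
    return 0.0 if x == 0 else x * math.log(y)


def bernoulli_llr(c, n, total_cases, total_n):
    """Log-likelihood ratio for a zone with c cases among n subjects (cases + controls)."""
    if n == 0 or n == total_n:
        return 0.0
    p_in = c / n
    p_out = (total_cases - c) / (total_n - n)
    if p_in <= p_out:
        return 0.0  # only zones with a higher case proportion inside are of interest
    inside = _xlogy(c, p_in) + _xlogy(n - c, 1 - p_in)
    outside = _xlogy(total_cases - c, p_out) + _xlogy(
        total_n - n - (total_cases - c), 1 - p_out
    )
    null = _xlogy(total_cases, total_cases / total_n) + _xlogy(
        total_n - total_cases, 1 - total_cases / total_n
    )
    return inside + outside - null


def monte_carlo_p_value(observed_llr, simulated_llrs):
    """Rank-based p-value: share of replicates at least as extreme as observed."""
    at_least = sum(1 for s in simulated_llrs if s >= observed_llr)
    return (1 + at_least) / (1 + len(simulated_llrs))
```

With 999 simulated replicates, as reported above, the smallest attainable p-value is 1/1000, which occurs when the observed statistic exceeds every simulated maximum.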

Further, we examined breast cancer risk associated with residence in the cluster at
the time of birth. When we compared other breast cancer risk factors, such as age,
education, and age at menarche, for the premenopausal breast cancer cases and controls
whose birth residence was inside the cluster to those who lived outside the cluster, we did
not find significant differences between the two groups (data not shown). We observed an
elevated breast cancer risk for premenopausal women living in the cluster at the time of
birth. With subjects living outside the cluster as a reference group, the adjusted odds ratio
was 2.65 (95% CI 1.75-4.0) after controlling for age, education, age at menarche, parity,
history of benign breast disease, and family history of breast cancer.

We also identified clustering of menarche residence for premenopausal women,
with results similar to those for birth residence. A small cluster of menarche
residences for premenopausal breast cancer cases was detected
in the center of those four towns (Figure 4b). It has a
0.8 mile radius and is statistically significant at p<0.05. The cluster contains 9 observed
and 3.1 expected breast cancer cases, yielding a relative risk (ratio of observed to
expected breast cancer cases) of 2.9. A secondary cluster was also detected near the city
of Buffalo. It has a three mile radius and a relative risk of 1.38 with 65 observed and 47
expected breast cancer cases, but it is not statistically significant (p=0.38).
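
The observed-to-expected ratios quoted for the clusters can be checked with a few lines of arithmetic. The counts are those reported in the text; the dictionary keys and function name are labels introduced here for illustration only.

```python
def observed_to_expected(observed, expected):
    """Ratio of observed to expected case counts inside a cluster."""
    return observed / expected


# (observed, expected) counts as reported for each cluster.
clusters = {
    "birth, primary": (100, 76),
    "menarche, primary": (9, 3.1),
    "menarche, secondary": (65, 47),
}
ratios = {
    name: round(observed_to_expected(obs, exp), 2)
    for name, (obs, exp) in clusters.items()
}
```

The menarche clusters reproduce the reported values of 2.9 and 1.38; the same calculation for the birth cluster gives about 1.32.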


Discussion 

To our knowledge, no other studies have examined clustering of residential
locations associated with cancer during early life; previous studies have examined clustering of
residential locations at the time of diagnosis or death. Critical time periods, including
birth, menarche, and women’s first pregnancy, as important early life and reproductive
events in women’s lives, may play a substantial role in the risk of breast cancer. Under the
hypothesis that there may be sensitive time periods in women’s lives that carry
greater risk from exposure, the essential question was whether cases were more clustered
than the underlying population, as represented by the controls. We found that cases were
more clustered than controls at the time of birth and menarche, and that this was due to
clustering of residence for premenopausal, but not postmenopausal, breast cancer.
The evidence for clustering of residential locations at birth and menarche was stronger
than evidence for clustering at the time of women’s first birth or other time periods in
adult life. Our findings suggest that there may be identifiable etiological processes
linking exposure and breast cancer risk, especially for premenopausal women, and that
early exposures may be of particular importance.

This study provided a unique opportunity to examine clustering of breast cancer
cases and controls at various points during early life. The fact that the study area had a
relatively stable population, with about forty percent of study participants being lifetime
residents, makes the results more reliable. The evidence that residence in early life was
important in the geographical clustering of breast cancer cases may be of particular
importance for understanding environmental determinants of breast cancer. These
findings suggest the importance of early or lifetime exposure in relation to disease risk in
adult life, and also the potential role of the effects of migration on exposures and disease
risk. Although migration can have a serious effect on the detection of geographical
differences in disease risk, it has not been adequately addressed in previous clustering
analyses. Further investigations are required to establish any relationship between
geographic clustering of residence and breast cancer risk, and the effects of residential
changes on exposures should be considered in these studies.

Our finding of clustering was restricted to premenopausal breast cancer. We
stratified on menopausal status because of evidence that there were differences in risk
factors for pre- and postmenopausal women. The mechanism of the observed difference
is not clear. It could be that early life exposures impact premenopausal more than
postmenopausal disease because of greater temporal proximity. There is some evidence,
though not consistent, that other early exposures may differ by menopausal status. For
example, there are data suggesting that birthweight may be more associated with pre-
than with postmenopausal breast cancer.

The results should be interpreted cautiously because there may be some
artifacts of the analysis. First, it is important to note that spatial point patterns are
complex to summarize in a single way. For example, the use of cumulative scales in the
application of the K-function method may influence the outcome. In particular,
clustering is more likely to be detected on a larger geographic scale, and it tends to show
continuous patterns over several neighboring scales because the geographical
scales are cumulative. Further refinement of methods to summarize spatial point patterns
may provide more reliable results, as well as more accurate estimates of disease risk.

Second, this study is limited to current residents in the study area because we
focused on the residential environment of Erie and Niagara counties; participants residing
outside of these two counties at the time of each early life event were not included. The
existence of missing residential information and potential selection bias due to
nonparticipation may influence the results. As noted, we found no difference in participation
by residence for cases compared to controls. Further, we would expect that our findings on
the clustering of early-life residence would be less subject to potential geographic
selection bias than would current residence. We found a greater degree of clustering for
residence in early life than for current residential location.

Further,  the  fact  that  residence  at  birth  and  menarche  were  often  the  same  made 
it  difficult  to  differentiate  associations  for  the  two  time  periods.  For  22%  of  cases  and 
35%  of  controls,  the  menarche  residence  was  the  same  as  their  birth  residence.  While  the 
observed  tendencies  may  be  related  to  environmental  exposures,  it  is  also  possible  that 
clustering  of  residenee  at  the  time  of  birth  or  menarche  may  be  due  to  clustering  of  other 
socioeconomic  or  demographic  factors.  Evaluation  of  the  contribution  of  socioeconomic 
status  to  clustering  of  residences  at  birth  and  menarche  is  of  special  interest.  There  may 
be  other  factors  associated  with  residence  not  measured  in  this  study.  The  findings  are 
still  of  interest  for  further  study  in  order  to  understand  what  those  exposures  might  be. 

We  are  now  investigating  the  relation  between  spatio-temporal  clustering  of  residences 
and exposures to environmental compounds, such as PAHs and benzene, to provide
epidemiologic  evidence  of  this  finding. 

Since  the  publication  of  John  Snow’s^^  well-known  cholera  map  for  the  city  of 
London  in  the  19th  century,  the  relationship  between  the  environment  and  disease  has 
been  one  of  the  major  research  themes  in  medical  geography.  Geographic  perspectives 
are  of  great  use  in  describing  geographical  patterns  of  diseases,  generating  hypotheses  on 
disease  etiology,  monitoring  high  risk  areas  of  disease  incidence,  and  suggesting  possible 
causal  factors  of  particular  disease.^’’  Our  study  demonstrated  that  these  GIS-based 
clustering  analyses  provide  effective  ways  to  explore  spatial-temporal  patterns  of 
clustering.  The  findings  show  consistent  results;  the  cluster  identified  by  spatial  analyses 
remained  significant  when  traditional  epidemiologic  methods  were  used,  and  it  was  not 
explained  by  potential  confounders.  A  recent  study  comparing  “traditional” 
epidemiological  methods,  GIS,  and  point  pattern  analysis  for  use  in  the  spatially 
referenced  public  health  data  concluded  that  results  complement,  rather  than  contradict  or 
duplicate  each  other.^^ 

In  summary,  this  analysis  of  breast  cancer  clustering  in  space  provides  evidence 
of  geographic  clustering  of  premenopausal,  but  not  postmenopausal,  breast  cancer  cases 
at  the  time  of  birth  and  menarche,  suggesting  a  possible  influence  of  exogenous  risk 
factors  on  breast  cancer  at  these  time  points.  While  it  is  not  clear  from  these  data  what 
caused  this  spatial  clustering,  it  is  provocative  in  providing  evidence  of  the  importance  of 
this  early  period  in  breast  carcinogenesis.  Further  investigations  on  genetic  susceptibility 
may  be  of  relevance  to  identify  different  effects  on  pre-  and  postmenopausal  breast 
cancer.  It  will  also  be  meaningful  to  see  whether  there  is  temporal  clustering  of  early-life 
residences as well as spatial clustering. This type of study also needs to be replicated in
other  settings. 

Table 1. Residential history of breast cancer cases and controls: numbers and percentage
of complete and missing residences in Erie and Niagara counties: WEB Study, 1996-2001.

              Complete Residence          Incomplete or Missing       Total eligible Erie and Niagara
                                          Residence                   county residence at each time period
              Case          Control       Case          Control       Case          Control
Birth         505 (79.9%)   804 (81%)     127 (20.1%)   189 (19%)     632           993
Menarche      673 (87.3%)   1143 (88.1%)  98 (12.7%)    154 (11.9%)   771           1297
First birth   616 (86.4%)   1153 (87.3%)  97 (13.6%)    167 (12.7%)   713           1320

Table 2. Characteristics of subjects included in the analysis, subjects with missing
residential information, and subjects excluded due to residence outside of the study area
(Mean±SD); WEB Study, 1996-2001.

                                          Cases (n=1166)                      Controls (n=2105)
                                          Included    Missing     Ineligible* Included    Missing     Ineligible*

Birth                                     (n=505)     (n=127)     (n=534)     (n=804)     (n=189)     (n=1112)
Age (years)                               56.5±10.9   60.0±11.0   58.?±11.3   55.6±11.7   58.0±11.8   59.4±11.7
Education (years)                         13.5±2.4    13.1±2.5    13.6±2.7    13.4±2.2    13.2±2.2    13.3±2.5
Parity                                    2.2±1.5     2.4±1.7     2.4±1.8     2.6±1.8     2.7±1.8     2.8±1.8
Age at menarche (years)                   12.4±1.5    12.6±1.5    12.7±1.7    12.7±1.7    12.6±1.6    12.7±1.7
Age at first birth (years)                24.3±4.6    23.5±4.5    24.2±5.1    24.5±4.3    23.5±4.2    24.0±4.7
Premenopausal (%)                         35.2        18.9        26.4        31.7        28.6        24.6
Body Mass Index                           28.2±6.4    28.4±5.8    28.7±6.4    28.0±6.2    28.2±6.0    28.4±6.4
Family history of breast cancer (% yes)   21.3        18.9        20.2        12.7        16.2        12.4
History of benign breast disease (% yes)  34.9        37.0        32.8        22.3        25.9        20.6

Menarche                                  (n=673)     (n=98)      (n=395)     (n=1143)    (n=154)     (n=808)
Age (years)                               56.6±10.7   60.1±11.6   59.5±11.3   56.0±11.7   60.2±11.7   59.9±11.6
Education (years)                         13.5±2.4    12.8±2.6    13.6±2.8    13.4±2.2    13.0±2.3    13.3±2.6
Parity                                    2.2±1.6     2.8±1.8     2.5±1.8     2.6±1.8     2.9±2.1     2.9±1.8
Age at menarche (years)                   12.5±1.6    12.8±1.5    12.7±1.7    12.7±1.6    12.6±1.7    12.7±1.7
Age at first birth (years)                24.3±4.6    23.0±4.3    24.2±5.3    24.4±4.5    23.8±4.4    24.0±4.6
Premenopausal (%)                         30.3        24.5        24.6        33.8        23.4        23.3
Body Mass Index                           28.1±6.2    29.5±6.4    28.7±6.5    28.3±6.5    27.6±5.5    28.2±6.1
Family history of breast cancer (% yes)   20.2        22.4        20.6        13.1        13.2        12.1
History of benign breast disease (% yes)  34.5        40.8        31.9        22.3        19.5        21.3

First Birth                               (n=616)     (n=97)      (n=453)     (n=1153)    (n=167)     (n=785)
Age (years)                               57.4±11.1   58.9±10.8   58.5±11.2   57.0±11.7   60.6±10.7   58.5±12.0
Education (years)                         13.4±2.3    13.0±2.9    13.7±2.8    13.3±2.2    13.0±2.1    13.4±2.6
Parity                                    2.7±1.3     3.1±1.5     1.7±1.9     3.0±1.5     3.4±1.7     2.2±2.0
Age at menarche (years)                   12.6±1.5    12.5±1.8    12.6±1.6    12.7±1.6    12.6±1.5    12.7±1.7
Age at first birth (years)                24.8±4.8    22.2±4.1    23.4±4.9    24.7±4.6    22.9±3.5    23.3±4.4
Premenopausal (%)                         29.4        26.8        26.0        32.2        18.0        26.6
Body Mass Index                           28.4±6.3    30.1±6.6    28.2±6.3    28.1±6.1    28.3±6.5    28.4±6.4
Family history of breast cancer (% yes)   21.2        23.7        18.6        11.5        19.4        13.6
History of benign breast disease (% yes)  34.7        37.1        32.7        21.0        25.1        22.0

* Ineligible due to residence outside of Erie and Niagara county

Figure 3. K-function differences in clustering patterns between breast cancer cases and controls by menopausal status, WEB Study
[Figure not reproduced; two panels, x-axis: Distance (miles)]

Figure 4. Geographic clustering of residence at birth and menarche: premenopausal breast cancer, WEB Study, 1996-2001

Abstract

Polycyclic  aromatic  hydrocarbons  (PAHs)  are  ubiquitous  in  the  environment.  We 
hypothesized  that  early  life  exposure  to  PAHs  may  have  particular  importance  in  the 
etiology  of  breast  cancer.  We  conducted  a  population-based,  case-control  study  of 
ambient  PAH  exposure  in  early  life  in  relation  to  the  risk  of  breast  cancer.  Total 
suspended  particulates  (TSP),  a  measure  of  ambient  air  pollution,  was  used  as  a  proxy  for 
PAH exposure. Cases (n=1,166) were women with histologically-confirmed, primary,
incident  breast  cancer.  Controls  (n=2,105)  were  frequency  matched  by  age,  race,  and 
county  of  residence  to  cases.  Annual  average  TSP  concentrations  (1959-1997)  by 
location  were  obtained  from  the  New  York  State  Department  of  Environmental 
Conservation  for  Erie  and  Niagara  Counties.  Based  on  the  monitor  readings,  prediction 
maps of TSP concentrations were generated with ArcGIS 8.0 (ESRI, Inc., Redlands, CA)
using inverse distance squared weighted interpolation. Unconditional logistic regression
was used to estimate odds ratios (OR) and 95% confidence intervals (95% CI). In
postmenopausal women, exposure to high concentrations of TSP (>140 µg/m³) was
associated with an adjusted OR of 2.42 (95% CI=0.97-6.09) compared with exposure to
low concentrations (<84 µg/m³). However, in premenopausal women, where exposures
were  generally  lower,  the  results  were  inconsistent  with  our  hypothesis  and  in  some 
instances  were  suggestive  of  a  reduction  in  the  risk  of  breast  cancer.  Our  study  suggests 
that  exposure  in  early  life  to  high  levels  of  PAHs  may  increase  the  risk  of 
postmenopausal breast cancer; however, other confounders related to geography cannot be
ruled out.

Introduction

Polycyclic  aromatic  hydrocarbons  (PAHs)  are  ubiquitous  in  the  environment  and 
commonly present in particulate air pollution (1, 2). PAHs are a broad category of
chemical  compounds  composed  of  carbon  and  hydrogen  and  are  formed  as  a  by-product 
during  combustion  of  organic  materials.  Important  sources  of  PAHs  include  cigarette 
smoke,  steel  mills,  foundries,  automobiles,  coal  combustion  for  electricity  production  and 
many  other  industrial  and  non-industrial  processes.  PAHs  are  also  found  in  food  and  are 
formed when food is cooked at high temperatures (e.g., grilling meats). In addition to
anthropogenic sources, natural sources (e.g., volcanoes and forest fires) also contribute
PAHs  to  the  atmosphere  (1,  3).  Most  of  these  sources  not  only  contribute  to  the  release 
of  PAHs  into  the  environment,  but  also  contribute  to  particulate  air  pollution.  Ninety  to 
95%  of  particulate  phase  PAHs  are  physically  associated  with  particulate  matter  less  than 
3.3 µm (2, 4). These small particles are thought to have particular biologic relevance
since they can be inhaled and deposited in the lower respiratory tract (5). PAHs are
lipophilic (6, 7). They have been shown to be mammary carcinogens in animal models (1,
8, 9), and there is evidence that they may also be human mammary carcinogens (10, 11).
In  addition,  PAHs  may  also  have  estrogenic  and  antiestrogenic  properties  that  could 
potentially  affect  breast  cancer  risk  (12). 

No  studies  have  examined  exposure  to  total  suspended  particulates  and  breast  cancer 
risk  and  only  a  few  epidemiologic  investigations  of  breast  cancer  have  examined  PAHs. 
Petralia  and  colleagues  (12)  examined  premenopausal  breast  cancer  and  occupational 
exposure  to  benzene  and  PAHs  using  job-exposure  matrices  in  a  population-based,  case- 
control  study.  High  probability  of  occupational  exposure  to  benzene  and  PAHs  was 
associated with premenopausal breast cancer. However, because women were exposed to
a  mixture  of  compounds,  it  was  difficult  in  that  study  to  estimate  whether  PAHs  had  an 
independent  effect  on  breast  cancer  risk. 

Rundle et al. (11) examined PAH-DNA adducts in breast tumor tissue. They found a
2-fold increase in PAH-DNA adducts in malignant tumors compared with tissue from
controls with benign breast disease with atypia. Gammon et al. (10) examined PAH-DNA
adducts  in  mononuclear  cells  in  relation  to  the  risk  of  breast  cancer  in  a  case-control 
study  of  Long  Island  residents.  They  found  a  nearly  50%  increase  in  the  risk  of  breast 
cancer  for  subjects  in  the  highest  quintile  of  PAH-DNA  adducts  in  mononuclear  cells; 
there  was  no  dose-response  relationship. 

Early life exposures, including exposure to PAHs, may have particular importance
in the etiology of breast cancer (14). Early age at exposure to ionizing radiation, for
example, confers increased risk of breast cancer when compared with later age at
exposure (15, 16). In addition, several other established risk factors also indicate the
importance  of  early  life  factors  in  the  etiology  of  breast  cancer.  Breast  cancer  risk  is 
increased  in  women  with  earlier  age  at  menarche,  whereas  earlier  age  at  first  birth 
reduces  the  risk  of  breast  cancer.  The  physiological  changes  that  occur  to  breast  tissue 
during  development  further  support  the  postulation  that  early  life  exposures  may  be 
important. Around menarche, the mammary gland begins to develop and differentiate
into  defined  ducts  and  lobules.  The  primary  lobules  formed  at  this  time  are  type  1 
lobules.  These  lobules  further  differentiate  into  type  2  and  type  3  lobules  during 
pregnancy (17). In vitro studies have shown that cells from type 1 lobules are more
sensitive to proliferation signals than cells from either type 2 or type 3 lobules (18). In
addition, human breast epithelial cells from type 1 lobules were more sensitive to the
transforming effects of the PAH 7,12-dimethylbenz(a)anthracene and of N-methyl-N-
nitrosourea than were type 3 lobule cells (19).

We  conducted  a  population-based,  case-control  study  of  PAH  exposure  in  early  life  in 
relation  to  the  risk  of  breast  cancer  using  total  suspended  particulates  (TSP),  a  measure  of 
ambient  air  pollution,  as  a  proxy  for  PAH  exposure.  We  examined  time  periods  that  are 
thought to be critical exposure periods with regard to susceptibility to breast cancer: at
the time of birth, at menarche, and at the time the participant first gave birth.

Materials  and  Methods 

The Western New York Exposures and Breast Cancer Study (WEB Study) is a
population-based, case-control study conducted with women living in Erie and
Niagara Counties in Western New York State during 1996-2001. All participants were
aged  35-79  years.  Cases  included  1,166  women  with  histologically-confirmed,  primary, 
incident  breast  cancer.  In  addition,  cases  under  the  age  of  65  years  were  restricted  to 
women  with  a  driver’s  license.  Controls  (n=2,105)  were  frequency  matched  by  age,  race, 
and  county  of  residence  to  cases.  Controls  under  the  age  of  65  years  were  randomly 
selected  from  the  New  York  State  Department  of  Motor  Vehicles  driver’s  license  list  and 
controls 65 years of age and over were randomly selected from the Healthcare Financing
Administration  Medicare  rolls.  For  these  analyses,  cases  and  controls  were  restricted  to 
participants  who  were  residents  of  Erie  and  Niagara  Counties  during  each  of  the  three 
pertinent time periods: birth, menarche, and first birth. A total of 1,638 cases and 3,396
controls met our inclusion criteria: 35-79 years of age, current resident of Erie
or Niagara County, no previous cancer diagnosis other than non-melanoma skin cancer,
and an ability to speak English. The response rates were 71% (1,166/1,638) and 62%
(2,105/3,396)  for  cases  and  controls,  respectively.  All  participants  provided  informed 
consent;  the  protocol  was  approved  by  the  Institutional  Review  Boards  of  the  University 
at  Buffalo  School  of  Medicine  and  Biomedical  Sciences  and  of  participating  hospitals. 

Data  Collection 

Using  extensive  in-person  interviews  and  self-administered  questionnaires, 
participants  provided  information  regarding  medical  history,  diet,  alcohol  consumption, 
smoking  history,  lifetime  passive  smoke  exposure,  occupational  history,  and  residential 
history.  Residential  histories  were  reported  by  the  subject  dating  back  to  birth.  For 
addresses  in  Erie  and  Niagara  Counties,  Polk  and  city  directories  were  searched  to  find 
missing  address  information.  For  addresses  with  missing  zip  codes,  we  used  ZP4 
(Semaphore  Corporation,  Aptos,  CA),  a  commercially  available  database  that  uses 
information  about  street  name  and  number  and  city  designation  to  find  missing  zip  codes. 
Residential  histories  and  interview  data  were  used  to  identify  each  subject’s  residence  at 
her  birth,  menarche,  and  her  first  birth.  These  addresses  were  geocoded  with  ArcView 
3.2  (ESRI,  Inc.,  Redlands,  CA)  using  Dynamap  2000  (GDT  Inc.,  Lebanon,  NH)  as  the 
reference  theme  (i.e.,  street  map)  of  Erie  and  Niagara  Counties. 

Exposure  Assessment 

The  New  York  State  Department  of  Environmental  Conservation  maintains  air 
monitors  that  began  measuring  total  suspended  particulates  (TSP)  in  1959.  These 
monitors  measured  TSP  concentrations  every  seven  days.  Annual  average  TSP 
concentrations  (1959-1997)  were  obtained  from  these  monitors  for  Erie  and  Niagara 
Counties. In total, 87 monitors were operating at various times in Erie and Niagara
Counties. For the period of the 1960’s, there were fewer monitors operating than at later
time periods. There was very little within-monitor variation of TSP concentration during
this  time  period  and  average  TSP  concentrations  were  calculated  for  the  entire  decade  for 
each  monitor.  By  averaging  the  TSP  concentrations  for  each  monitor,  the  overall  TSP 
estimates  were  more  stable.  Considerably  more  monitors  were  operating  in  the  years  after 
1969.  Annual  average  TSP  concentrations  were  calculated  for  each  year  for  1970 
through  1997  for  each  monitor.  In  addition  to  TSP,  ambient  benzo(a)pyrene  (B(a)P)  was 
measured  between  November  1,  1973  and  November  1,  1974  in  Erie  County,  New  York 
for 11 of the 87 monitoring sites. The Pearson correlation coefficient between the
measured log transformed TSP and log transformed B(a)P concentrations at these 11
monitoring  sites  was  0.90,  suggesting  that  the  ambient  TSP  concentrations  reasonably 
estimate  ambient  PAHs  concentrations  in  this  region. 
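
The proxy check described above can be sketched in a few lines. This is only an illustration of a Pearson correlation computed on log-transformed paired readings; the function names and the example values are hypothetical and are not the study's monitor data.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def log_log_correlation(tsp, bap):
    """Correlation between natural logs of paired TSP and B(a)P readings."""
    return pearson_r([math.log(v) for v in tsp], [math.log(v) for v in bap])

# Hypothetical paired annual averages at five sites (illustrative values only).
tsp_example = [60.0, 85.0, 110.0, 150.0, 200.0]
bap_example = [0.5, 1.1, 1.6, 3.2, 5.0]
r = log_log_correlation(tsp_example, bap_example)
```

A coefficient near 1 on the log scale indicates that B(a)P varies roughly as a power of TSP across sites, which is the sense in which TSP is used here as a proxy.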

Based  on  the  monitor  readings  for  each  time  period,  prediction  maps  of  TSP 
concentrations  were  generated  with  ArcGIS  8.0  (ESRI,  Inc.,  Redlands,  CA)  using  inverse 
distance  squared  weighted  interpolation.  We  assumed  a  45-degree  angle  to  account  for 
the  prevailing  southwesterly  winds  and  limited  the  exposure  estimation  for  each  address 
to  the  seven  closest  sampling  monitors.  The  primary  assumption  of  these  geostatistical 
methods  is  that  close  locations  are  more  similar  to  one  another  than  are  locations 
relatively farther away (20). The estimated individual residential TSP concentrations
were  insensitive  to  changing  the  number  of  monitors  included  for  the  exposure 
estimation.  In  total,  29  prediction  maps  were  constructed  of  estimated  TSP 
concentrations  for  the  two-county  region;  one  for  the  1960’s  and  one  for  each  year  after 
that until 1997. These maps were used to determine exposure to TSP at each
participant’s address for the relevant time period. The 1960’s TSP concentration
prediction map is provided as an example in Figure 1.
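
As a rough illustration of the interpolation step, the following sketch computes an inverse distance squared weighted estimate from the nearest monitors. It is a simplified stand-in, not the ArcGIS procedure: it assumes planar coordinates, ignores the 45-degree directional adjustment for prevailing winds, and uses hypothetical names (`idw_estimate`, `monitors`).

```python
import math

def idw_estimate(x, y, monitors, k=7, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    monitors: list of (mx, my, value) tuples in a planar coordinate system.
    Only the k nearest monitors contribute; weights are 1 / distance**power.
    Simplification: isotropic distances, no wind-direction adjustment.
    """
    if not monitors:
        raise ValueError("at least one monitor is required")
    nearest = sorted(
        ((math.hypot(x - mx, y - my), value) for mx, my, value in monitors),
        key=lambda pair: pair[0],
    )[:k]
    # An address that coincides with a monitor takes that monitor's value.
    if nearest[0][0] == 0.0:
        return nearest[0][1]
    weights = [1.0 / d ** power for d, _ in nearest]
    return sum(w * v for w, (_, v) in zip(weights, nearest)) / sum(weights)
```

Because the weights fall off with the square of distance, a monitor twice as far away contributes one quarter of the weight, which matches the stated assumption that nearby locations are more alike than distant ones.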

TSP  concentrations  for  addresses  before  the  1960s  were  estimated  assuming  that  the 
interpolated  concentrations  in  the  1960s  were  representative  of  earlier  time  periods. 
Industrialization in Erie and Niagara Counties began at the end of the 19th century and the
industrial activities that contributed most heavily to air pollution were very active prior to
the 1960’s and were relatively constant over the time period (21). Further, measures to
control  air  quality  were  not  implemented  until  the  early  1970’s.  Consequently,  the 
1960’s  concentrations  of  TSP  probably  reflect  ambient  levels  in  the  earlier  time  period. 

Statistical  Analysis 

Unconditional  logistic  regression  (22)  was  used  to  estimate  odds  ratios  (OR)  and  95% 
confidence intervals (95% CI). TSP concentrations were categorized into 4 levels (<84
µg/m³, 84-114 µg/m³, 115-140 µg/m³, and >140 µg/m³). The cut points for the categorical
analyses  were  derived  from  the  quartiles  of  the  distribution  of  measurements  of  TSP 
concentrations  in  the  1960s.  In  addition  to  the  categorical  analysis,  we  examined  TSP 
concentrations  on  a  continuous  scale.  Further,  logistic  quadratic  spline  regression  with 
knots at 84 µg/m³ and 140 µg/m³ was used to graphically depict the exposure-response
trend;  the  estimated  probability  of  being  a  case  was  calculated  from  the  quadratic  spline 
regression  equation  and  adjusted  for  age,  education  and  parity.  The  values  for  the  two 
knots  in  the  spline  regression  were  selected  based  on  the  previous  categorical  analysis. 

The  end  categories  were  restricted  to  linear  segments  to  prevent  instability  (25). 
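
The categorical analysis can be illustrated with a small sketch. The category function below assumes one way of handling values that fall between the printed cut points (for example 114.5 µg/m³), and the odds ratio function computes a crude, unadjusted OR with a Woolf-type 95% interval from a 2x2 table. The adjusted estimates reported in this paper came from multivariable logistic regression, so this is a simplified illustration with hypothetical function names, not the analysis itself.

```python
import math

def tsp_category(conc):
    """Map a TSP concentration (µg/m³) to one of four exposure levels.

    Assumed boundary handling: 0 = <84, 1 = 84 to <115,
    2 = 115 to 140, 3 = >140.
    """
    if conc < 84:
        return 0
    if conc < 115:
        return 1
    if conc <= 140:
        return 2
    return 3

def crude_odds_ratio(exp_cases, exp_controls, ref_cases, ref_controls, z=1.96):
    """Unadjusted odds ratio and Woolf confidence interval from a 2x2 table."""
    odds_ratio = (exp_cases * ref_controls) / (exp_controls * ref_cases)
    se_log_or = math.sqrt(
        1 / exp_cases + 1 / exp_controls + 1 / ref_cases + 1 / ref_controls
    )
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

Small counts in the referent cell inflate the standard error of the log odds ratio, which is why sparse low-exposure categories produce wide confidence intervals.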

We considered age, race, education, age at first birth, age at menarche, parity,
previous benign breast disease, family history of breast cancer, body mass index (weight
(kg)/height (m)²), and age at menopause as potential confounders in multivariate logistic
regression. The models presented include age, education, and parity; these covariates were
selected by excluding variables from the full model that did not alter the risk estimates by more
than  10%.  All  models  were  stratified  by  menopausal  status.  P  for  trend  statistics  were 
determined  by  the  p-value  for  the  coefficient  of  the  continuous  exposure  variable,  while 
adjusting  for  covariates. 

Results 

The  descriptive  characteristics  of  the  participants  included  in  the  birth,  menarche,  first 
birth,  and  the  overall  case-control  study  are  depicted  in  Table  1.  There  were  no  major 
differences in the distributions of these variables across time periods. We
were able to successfully geocode 79%, 87%, and 87% of the Erie and Niagara County
birth,  menarche,  and  first  birth  addresses,  respectively. 

Exposure to concentrations of TSP >84 µg/m³ at the time of birth was associated with
an increase in the odds ratio for premenopausal women (Table 2); however, there was no
exposure-response  relationship  and  the  P  for  trend  was  not  significant.  In  addition,  there 
were  relatively  few  participants  exposed  to  the  lowest  concentrations  of  TSP  and  this 
resulted  in  wide  confidence  intervals  for  the  corresponding  point  estimates.  In 
postmenopausal women, exposure to high concentrations of TSP (>140 µg/m³) was
associated with an adjusted OR of 2.42 (95% CI=0.97-6.09) compared with exposure to
low concentrations (<84 µg/m³). For risk associated with estimated residential TSP
concentrations on a continuous scale, in postmenopausal women, we observed a 21%
increase in the odds ratio for every 30 µg/m³ increase in TSP concentration (adjusted OR
= 1.20, 95% CI = 1.04-1.38). In the spline regression analysis, there was an increase in
the probability of being a case with an increase in TSP concentration (Figure 2). No
increase in risk was observed for premenopausal women on a continuous scale (OR =
0.92, 95% CI = 0.76-1.11 for every 30 µg/m³ increase in TSP). Further, the spline
regression  analysis  for  the  premenopausal  women  indicated  an  inverted  parabola 
exposure-response  relationship  with  increasing  TSP  concentration  (Figure  3). 
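
To make the continuous-scale estimates concrete, the sketch below shows how a logistic regression coefficient per 1 µg/m³ translates into an odds ratio per 30 µg/m³. The coefficient and standard error are back-calculated from the rounded postmenopausal estimate reported above (OR = 1.20, 95% CI = 1.04-1.38) purely for illustration; they are not the fitted model output, and the function name is hypothetical.

```python
import math

def or_per_increment(beta, se, increment=30.0, z=1.96):
    """Odds ratio and 95% CI for a given increment in a continuous exposure.

    beta, se: log-odds coefficient and standard error per 1 unit of exposure.
    """
    point = math.exp(beta * increment)
    lower = math.exp((beta - z * se) * increment)
    upper = math.exp((beta + z * se) * increment)
    return point, lower, upper

# Back-calculated from the rounded published values (illustration only).
beta_example = math.log(1.20) / 30.0
se_example = (math.log(1.38) - math.log(1.04)) / (2 * 1.96 * 30.0)

point, lower, upper = or_per_increment(beta_example, se_example)
per_10 = or_per_increment(beta_example, se_example, increment=10.0)[0]
```

Because the model is linear on the log-odds scale, the odds ratio for a different increment is obtained by rescaling the exponent, so the per-10 µg/m³ odds ratio is the cube root of the per-30 µg/m³ value.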

At  menarche,  exposure  to  high  concentrations  of  TSP  was  also  associated  with  a 
modest increase in the odds ratio for postmenopausal women with exposure >84 µg/m³,
although the P for trend was not significant (Table 3). In the continuous analysis, for
every 30 µg/m³ increase in TSP concentrations, the odds ratio increased 8% (adjusted OR
= 1.08, 95% CI = 0.96-1.21) for postmenopausal women. The risk estimates for the
premenopausal  women  were  not  consistent  with  our  hypothesis.  In  this  group,  there  was 
a  non-significant  reduction  in  risk  in  the  highest  exposure  category  (adjusted  OR  =  0.66, 
95% CI = 0.38-1.16). Exposure to high concentrations of TSP at the time of first birth
was also associated with a modest increase in the odds ratio for postmenopausal women
(Table 4). For premenopausal women exposed to high concentrations of TSP, there was
some indication of a non-significant reduction in the odds ratio (OR = 0.52, 95% CI =
0.22-1.20). 

Discussion 

While numerous epidemiologic studies have investigated the carcinogenicity of air
pollution in relation to lung cancer (24-26), to our knowledge, no investigations have
examined exposure to total suspended particulates and breast cancer. The findings from
this study suggest that early life exposure to high concentrations of TSP, a proxy measure
of  PAHs,  may  be  associated  with  an  increased  risk  of  breast  cancer  in  postmenopausal 
women.  We  found  more  than  a  two-fold  increase  in  risk  for  those  with  a  birth  residence 
in areas where exposure was greater than 140 µg/m³ compared with those with a birth
residence where concentration was less than 84 µg/m³. Exposure at menarche and first
birth was less strongly associated with risk. There was little evidence that exposure to
high  concentrations  of  TSP  was  positively  associated  with  premenopausal  breast  cancer. 
However,  the  inconsistency  of  these  findings  for  women  in  this  group  may  be  attributable 
to  insufficient  induction  time  between  exposure  in  early  life  and  the  occurrence  of  breast 
cancer  and  to  secular  changes  in  exposure  levels. 

Several  previous  studies  have  examined  PAH  exposure  in  adult  life  in  relation  to 
cancer (10, 11). There is evidence that PAH-DNA adducts in tumor tissue and peripheral
blood tend to be higher in breast cancer cases than in controls. Tumor PAH-DNA adduct
levels are markers of recent exposure and PAH-DNA adducts in mononuclear cells are at
best  indicative  of  exposure  several  years  prior  to  collection.  Our  findings  are  based  on 
historical  estimates  of  early  life  exposure.  They  support  the  hypothesis  that  PAH 
exposure  may  be  associated  with  breast  cancer  risk  and  indicate  that  early  life  exposure  to 
these  compounds  may  have  particular  relevance  to  the  etiology  of  breast  cancer. 

Other  exposures,  particularly  ionizing  radiation,  have  been  observed  to  increase  risk  of 
breast  cancer  with  early  age  at  exposure.  Similarly,  exposure  to  PAHs  in  early  life  may 
also  confer  increased  risk  of  breast  cancer  compared  with  adult  exposure  to  PAHs.  In 
addition,  there  is  some  evidence  that  early  life  exposure  to  PAHs  could  impact  the 
developing  fetus.  In  a  study  of  early  life  exposure  to  high  PAHs  concentrations  in  air, 
Perera and colleagues found PAH exposure to be associated with reduced birth weight,
birth length, and head circumference (27). Several studies investigating the relationship
between birth weight and the risk of breast cancer have observed a J-shaped curve with
birth weight: those <2,500 g at birth had increased risk of breast cancer compared
with women with birth weights of 2,500-2,999 g (28, 29).

It  is  also  possible  that  PAHs  may  not  affect  breast  cancer  risk  and  our  findings  are  a 
result  of  other  carcinogens  and  co-carcinogens  found  in  total  suspended  particulates.  We 
speculated  that  PAHs  physically  associated  with  TSP  may  be  the  agent  responsible  for 
the  association  between  TSP  and  breast  cancer  risk  that  we  observed.  However,  we 
cannot  rule  out  the  possibility  that  other  compounds  present  in  TSP  are  affecting  breast 
cancer  risk  or  are  acting  synergistically  with  PAHs.  In  experimental  studies,  for  instance, 
application  of  coal  tar  produced  more  skin  tumors  than  did  the  application  of  only 
benzo(a)pyrene,  which  is  thought  to  be  the  primary  carcinogen  in  coal  tar.  Other 
constituents  in  coal  tar  seem  to  contribute  to  the  carcinogenic  potential  and  enhance 
synergistically the effect of benzo(a)pyrene (9). It may be that it is the mixture of
compounds  in  TSP  that  is  relevant  to  breast  cancer  risk. 

Several  methodological  concerns  need  to  be  considered  when  interpreting  our 
findings.  Foremost  is  the  potential  for  selection  bias  to  affect  the  internal  validity  of  the 
study. To investigate the extent of the geographic selection bias, we compared the
geographic distribution of breast cancer cases in the study with that of the breast cancer
cases reported to the New York State Tumor Registry. The expected number of cases per
zip code in Erie and Niagara Counties was obtained from the NY State Tumor Registry
and compared with the number of cases identified for our study. Overall, there was some
evidence that cases identified for this study tended to reside closer to the study site
than  cases  identified  in  the  NY  State  Tumor  Registry.  When  the  expected  number  of 
controls  per  zip  code  (obtained  from  the  1990  U.S.  Census)  was  compared  with  the 
observed  number  of  controls,  controls  were  also  more  likely  to  currently  reside  more 
closely  to  the  study  site. 

In  addition,  there  is  the  possibility  that  our  results  were  biased  because  the  sample  was 
restricted to women who were both current residents of Erie or Niagara Counties at the
time  of  the  case-control  study  and  who  had  lived  there  during  their  earlier  life.  However, 
we  found  little  difference  between  those  subjects  with  birth  addresses  in  Erie  and  Niagara 
Counties  compared  with  those  subjects  with  birth  addresses  outside  of  these  two  counties 
with regard to demographic characteristics or established risk factors (data not shown).

Small  numbers  in  some  categories  and  the  resultant  large  confidence  intervals  affected 
our  ability  to  draw  conclusions  from  our  data.  The  distribution  of  TSP  concentrations 
contributed  to  the  small  numbers  in  certain  categories.  Ambient  TSP  concentrations  had 
large  spatial  variation  in  the  1960s,  but  in  general,  TSP  concentrations  were  high 
compared  with  later  time  periods.  However,  TSP  concentrations  began  to  decrease  in  the 
early 1970s, leading to low estimates in the 1970s-1990s with very little geographic
variation  in  TSP  concentrations.  Consequently,  the  distributions  for  each  time  period 
were  very  different.  Few  postmenopausal  participants  were  exposed  to  low 
concentrations  at  birth  and  few  premenopausal  women  were  exposed  to  high 
concentrations  at  the  time  of  first  birth.  These  trends  in  ambient  air  concentration  of  TSP 
precluded an analysis of exposure in adult life up to the time of diagnosis because of the
lack of variability. In order to be able to make comparisons between time periods, we
chose to use a common cut point for all analyses. The cut points for our analyses were
arbitrarily  selected  based  on  the  distribution  of  the  TSP  measurements  in  the  1960s.  With 
the  majority  of  participants  having  had  high  levels  of  TSP  at  birth,  these  cut  points 
resulted  in  small  numbers  in  the  referent  group.  However,  the  continuous  and  spline 
regression  analyses  support  the  direction  of  the  association  in  postmenopausal  women. 

In  addition  to  the  secular  changes  in  ambient  TSP  concentrations,  TSP  is  a  relatively 
crude  measure  of  ambient  air  pollution.  In  1987,  it  was  replaced  with  particulate  matter 
<10 microns (30). Currently, particulate matter <2.5 microns is considered to be the
most relevant measure for biologic effects of air pollution because these fine particles are
respired into the lower respiratory tract (5). However, TSP concentrations were the only
consistently measured ambient air pollutant in the early 1960s, the period before the
Clean  Air  Act,  which  led  to  reductions  in  ambient  air  pollution.  TSP  is  the  best  available 
measure  to  estimate  historical  exposure  to  air  pollution.  Nevertheless,  there  remains  the 
potential  for  exposure  misclassification  because  TSP  concentration  measurements  were 
used  as  a  surrogate  for  exposure  to  PAHs.  PAHs  exist  in  the  ambient  air  in  both  the 
gaseous  and  particulate  phase.  The  use  of  TSP  captures  exposure  to  PAHs  in  the 
particulate phase only (31), although ambient B(a)P concentrations were highly
correlated  (r  =  0.90)  with  TSP  concentrations  in  this  region.  In  addition,  the  interpolation 
method  used  to  estimate  concentrations  of  TSP  at  residential  addresses  likely  contributed 
some error. The air samplers were not randomly distributed throughout Erie and Niagara
Counties. In general, air samplers were placed in regions thought to have high levels of
air pollution. Because the monitoring system was not designed to provide county-wide
characterization of TSP levels, some outlying areas were never monitored and were
approximately 18 miles from the closest monitor.
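
To illustrate how distance from the nearest monitor can introduce error, the following is a minimal sketch of spatial interpolation at a residential address. This passage does not name the interpolation method that was used, so inverse distance weighting (IDW) is assumed here purely for illustration; the function name, the planar coordinates, and the `power` parameter are hypothetical and not taken from the study.

```python
import math


def idw_estimate(site, monitors, power=2.0):
    """Estimate a concentration at `site` by inverse distance weighting.

    Assumption (not from the source): IDW stands in for the study's
    unspecified interpolation method.
    site     -- (x, y) location of the residence, in any planar unit
    monitors -- list of ((x, y), concentration) for each air sampler
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for (mx, my), concentration in monitors:
        distance = math.hypot(site[0] - mx, site[1] - my)
        if distance == 0:
            # The residence coincides with a monitor: use its reading directly.
            return concentration
        weight = 1.0 / distance ** power
        weighted_sum += weight * concentration
        weight_total += weight
    if weight_total == 0:
        raise ValueError("at least one monitor is required")
    return weighted_sum / weight_total
```

Under this scheme, an address far from every sampler receives an estimate dominated by distant readings, which is one way the sparse and non-random placement of monitors could contribute to exposure misclassification.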

Another  potential  problem  in  assessing  exposure  arose  because  humans  are  peripatetic 
(52).  Therefore,  our  estimates  of  TSP  concentrations  are  site  specific  for  each  participant 
and  may  not  represent  exposures  at  other  places  where  these  participants  spent  time.  This 
is  likely  less  of  a  problem  for  the  analyses  of  birth  residence.  By  menarche,  however, 
these  participants  would  spend  a  considerable  proportion  of  their  time  away  from  home. 

In summary, we examined exposure to total suspended particulates, a surrogate for
PAH exposure, in relation to the risk of breast cancer. We found a suggestion of an
association between exposure to high concentrations of TSP at birth and an increased risk
of breast cancer in postmenopausal women. Among premenopausal women, there was
no evidence of such an association with risk of breast cancer. While these results are
suggestive,  they  necessarily  should  be  considered  preliminary.  Future  research  on  the 
effects  of  early  life  exposure  to  PAHs  and  other  related  compounds  is  warranted. 

Figure 1. Total Suspended Particulate Concentrations in Erie and Niagara Counties,


Table  1.  Descriptive  Characteristics  for  Study  Participants  at  Birth,  Menarche,  First  Birth,  and  Overall  Study;  Western  New  York  Exposures  and  Breast  Cancer 


Table  1,  continued.  Descriptive  Characteristics  for  Study  Participants  at  Birth,  Menarche,  First  Birth,  and  Overall  Study:  Western  New  York  Exposures  and 


Table  2.  Risk  associated  with  exposure  to  Total  Suspended  Particulate  Concentrations  at  Birth  Address:  Western  New  York  Exposures  and  Breast  Cancer  Study 
(WEB  Study). 

Abstract

Evidence  is  increasing  that  some  early  life  exposures  affect  breast  cancer  risk.  Exposure 
to  environmental  tobacco  smoke  (ETS)  during  childhood  may  be  one  such  exposure.  As 
part  of  the  WEB  Study  (Western  New  York  Exposures  and  Breast  Cancer  Study),  we 
conducted  a  population-based,  case-control  study  with  1,166  women  aged  35-79 
diagnosed  with  histologically  confirmed,  primary,  incident  breast  cancer.  Controls 
(n=2,105) were randomly selected from the Department of Motor Vehicles driver’s
license list (age 65 and younger) and the Healthcare Financing Administration Medicare
rolls (over age 65). Participants were queried regarding the number of smokers they lived
with and the number of years they resided with these smokers. Person-years of ETS
exposure were computed. Unconditional logistic regression adjusting for potential
confounders was used to calculate odds ratios (OR) and 95% confidence intervals (95% CI).
Exposure to ETS before the age of 21 may be weakly associated with an increase in breast
cancer risk for premenopausal women (OR = 1.46; 95% CI: 0.99-2.15) and postmenopausal
women (OR = 1.22; 95% CI: 0.94-1.58). Although these estimates may suggest a weak
association between early life ETS exposure and breast cancer, we cannot exclude the
possibility that ETS exposure is unrelated to risk.

Introduction

Increasingly,  there  is  interest  that  exposures  in  early  life  may  be  related  to  breast 
cancer  risk  because  of  evidence  that  the  breast  may  be  more  vulnerable  to  carcinogenic 
insults  during  this  period  of  tissue  proliferation  and  initial  differentiation.  Pregnancy  and 
lactation  result  in  further  differentiation  after  which  it  seems  that  breast  tissue  is  more 
resistant to carcinogenic insults. It has been hypothesized that exposures before a
woman gives birth for the first time may be particularly important in relation to disease
etiology. Environmental tobacco smoke is one such exposure that may affect the risk
of  developing  breast  cancer. 

Exposure to environmental tobacco smoke is relatively common among U.S.
children, an estimated 43% of whom reside with at least one smoker. Tobacco smoke
consists of numerous compounds that are carcinogenic to several organ sites including
the lung, bladder, and pancreas. Among these compounds are polycyclic
aromatic hydrocarbons (PAHs) and aromatic amines. PAHs are known skin and
mammary carcinogens in rodent models and accumulate in adipose tissue including
the breast. In addition, aromatic amines have been shown to be mammary
carcinogens in rodents. The effect of tobacco smoke on breast cancer risk, however, is
not clear. McMahon hypothesized that tobacco smoke may reduce the risk of breast
cancer because of evidence that cigarette smoke has antiestrogenic effects. Conversely,
Hiatt and Fireman reasoned that smoking could increase breast cancer risk because
mutagens from cigarette smoke concentrate in the breast fluid of nonlactating women.
Cigarette smoking is also associated with bladder and pancreatic cancers, both sites
without direct contact between smoke and the organ’s epithelium. Despite conflicting
hypotheses about the effect of tobacco smoke on breast cancer risk, an association
between cigarette smoking and breast cancer has yet to be clearly demonstrated.

Further, there is limited evidence that environmental tobacco smoke (ETS) affects breast
cancer risk. With regard to early life exposure, there have been a few studies of
ETS and risk; ETS has been found to be associated with increased risk of breast cancer in
some but not in all studies.

In  this  study,  we  explored  exposure  to  ETS  in  early  life  in  relation  to  the  risk  of 
subsequent  breast  cancer.  Specifically,  we  hypothesized  that  residing  with  one  or  more 
household  smokers  in  early  life  (up  to  age  21)  would  increase  the  risk  of  breast  cancer 
compared  with  women  who  did  not  reside  with  household  smokers. 

MATERIAL  AND  METHODS 

The  Western  New  York  Exposures  and  Breast  Cancer  Study  (WEB  Study)  is  a 
population-based, case-control study in Western New York. Cases (n=1,166) included
women  aged  35-79  years  diagnosed  with  histologically-confirmed,  primary,  incident 
breast  cancer  currently  residing  in  Erie  or  Niagara  Counties  in  Western  New  York. 

Nurse  case  finders  visited  the  pathology  departments  at  regular  intervals  to  identify  cases. 
After identification, the case’s physician was contacted to verify the diagnosis of breast
cancer  and  to  obtain  permission  to  contact  the  case.  Cases  were  then  contacted  and  asked 
to  participate  in  the  study.  Controls  were  also  current  residents  of  Erie  and  Niagara 
counties  randomly  selected  from  the  New  York  State  Department  of  Motor  Vehicles 
driver’s  license  list  (aged  65  and  less)  and  the  Centers  for  Medicare  and  Medicaid 
Services  rolls  (over  65  years).  Controls  (n  =  2,105)  were  frequency  matched  by  age  and 
race. A total of 1,638 cases and 3,396 controls met our inclusion criteria of between 35 and
79 years of age, current resident of Erie or Niagara County, no previous cancer diagnosis
other than non-melanoma skin cancer and an ability to speak English. The response rates
were 71% (1,166/1,638) and 62% (2,105/3,396) for cases and controls, respectively. All
participants provided informed consent; the protocol was approved by the University at
Buffalo School of Medicine and Biomedical Sciences’ and participating hospitals’
Institutional Review Boards.

Extensive  in-person  interviews  and  self-administered  questionnaires  were  used  to 
ascertain  medical  history,  diet,  lifetime  alcohol  consumption,  residential  history, 
occupational  history,  and  smoking  history.  We  evaluated  exposure  to  ETS  with  two 
methods.  First,  questions  about  exposure  to  environmental  tobacco  smoke  were  asked  for 
seven  age  periods:  1)  <21  years,  2)  21-30,  3)  31-40,  4)  41-50,  5)  51-60,  6)  61-70,  7)  >70. 
The  number  of  people  living  with  the  participant  who  smoked  cigarettes,  cigars,  or  pipes 
during  the  specified  time  period  was  ascertained.  In  addition,  participants  were  also 
asked  for  the  number  of  years  that  they  resided  with  these  smokers.  These  two  questions 
were  used  to  compute  person-years  of  ETS  exposure  for  participants  for  each  time  period. 
For  this  study,  we  only  considered  exposure  to  ETS  before  21  years  of  age.  Exposure  to 
ETS was categorized into three groups: 1) no ETS exposure (0 person-years), 2) >0 to
20 person-years of ETS exposure, and 3) >20 person-years of ETS exposure. The cut
point  of  20  person-years  of  ETS  exposure  was  derived  from  the  median  in  the  exposed 
controls. 
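
The person-years calculation and the three-level categorization described above can be sketched as follows. This is an illustration only, not the study's analysis code: the function names are hypothetical, and the sketch assumes that person-years for an age period equal the number of household smokers multiplied by the years the participant lived with them.

```python
def ets_person_years(n_smokers: int, years_lived_with: float) -> float:
    """Person-years of household ETS exposure for one age period.

    Assumption (not confirmed by the source): person-years is the number
    of smokers in the household times the years lived with them.
    """
    if n_smokers < 0 or years_lived_with < 0:
        raise ValueError("inputs must be non-negative")
    return n_smokers * years_lived_with


def ets_category(person_years: float, cut_point: float = 20.0) -> str:
    """Assign the three exposure groups used in the text.

    The default cut point of 20 person-years is the median among
    exposed controls reported in the text.
    """
    if person_years < 0:
        raise ValueError("person-years must be non-negative")
    if person_years == 0:
        return "none"
    if person_years <= cut_point:
        return ">0-20"
    return ">20"
```

For example, a participant who lived with two smokers for ten years before age 21 would accrue 20 person-years under this assumption and fall in the middle group.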

The  second  evaluation  of  ETS  exposure  was  part  of  the  residential  history 
assessment.  Participants  listed  each  residence  for  their  entire  life  with  corresponding 
information  on  the  number  of  other  people  who  resided  at  that  residence  and  the  number 
of those residents who smoked cigarettes. The analysis was restricted to those with
complete household smoking information at both birth and menarche (n = 334 for cases
and 609 for controls). For exposure at the time of first birth, the analysis was further
restricted  to  those  with  residential  information  for  all  three  time  periods.  Household 
smoking  was  categorized  into  a  binary  variable  denoting  either  the  presence  or  absence  of 
household  smokers. 

Unconditional logistic regression was used to calculate odds ratios (OR) and 95%
confidence intervals (CI), adjusting for age, race, education, age at first birth, age at
menarche, parity, previous benign breast disease, family history of breast cancer in a first
degree relative, body mass index (weight (kg)/height (m)^2), pack-years of smoking, total
lifetime alcohol consumption, and age at menopause for postmenopausal women only.

All models were stratified by menopausal status. A reduced model including age,
previous benign breast disease, and pack-years of smoking was determined by removing
covariates that did not alter the OR by more than 10%. Additional analyses were
conducted excluding former and current smokers to prevent a history of active smoking
from confounding any potential association between ETS and the risk of breast cancer. P
for trend was determined by the p-value for the coefficient of the continuous
exposure variable, while adjusting for covariates.
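
As a rough illustration of the quantities named above, the sketch below converts a logistic regression coefficient and its standard error into an odds ratio with a Wald 95% confidence interval, and applies a 10% change-in-estimate check. The function names are hypothetical, and measuring the change relative to the full-model OR is an assumption; the text does not state which model served as the reference.

```python
import math


def odds_ratio_ci(beta: float, se: float, z: float = 1.96):
    """Return (OR, lower, upper) from a log-odds coefficient and its SE.

    Uses the standard Wald interval: exp(beta +/- z * se).
    """
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def changes_estimate(or_full: float, or_reduced: float,
                     threshold: float = 0.10) -> bool:
    """True if removing a covariate shifts the OR by more than `threshold`.

    Assumption: the relative change is computed against the full-model OR.
    """
    return abs(or_reduced - or_full) / or_full > threshold
```

A covariate whose removal leaves `changes_estimate` false would be a candidate for exclusion from the reduced model under this criterion.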

RESULTS 

Demographic  characteristics  of  the  study  participants  by  menopausal  status  are 
shown in table 1. Exposure to ETS before the age of 21 was associated with an increase
in the risk of breast cancer for both premenopausal women (reduced model OR = 1.46,
95% CI: 0.99-2.15) and postmenopausal women (reduced model OR = 1.22, 95% CI:
0.94-1.58) (table 2), although confidence intervals include the null for both groups.
When  the  analysis  was  restricted  to  the  sub-group  of  never  smokers,  similar  results  were 
obtained,  although  confidence  intervals  were  wider  because  of  the  decrease  in  sample 
size  (data  not  shown). 

Associations  between  the  presence  of  household  smokers  in  the  participant’s 
residence  at  the  time  of  their  birth,  menarche,  and  first  birth  and  breast  cancer  among 
never  smokers  are  shown  in  table  3.  There  was  some  tendency  for  premenopausal 
women  with  breast  cancer  to  reside  with  one  or  more  household  smokers  at  their  birth 
address more often than controls (reduced model OR = 1.40, 95% CI: 0.83-2.38), while
in postmenopausal women, the presence of household smokers was not associated with
breast cancer (reduced model OR = 1.02, 95% CI: 0.68-1.55). Associations between the
presence  of  household  smokers  at  the  time  of  menarche  and  breast  cancer  were  similar. 

The presence of household smokers at the time of a woman’s first birth was not
associated with breast cancer in premenopausal women (reduced model OR = 1.24, 95%
CI: 0.70-2.21). For postmenopausal women, however, exposure to household smoke at
the time of first birth was suggestive, if anything, of a reduction in risk (reduced model OR =
0.71, 95% CI: 0.46-1.09). We attempted to examine each time period while adjusting for
the  other  two  time  periods  to  investigate  whether  one  time  period  in  particular  was 
associated  with  an  increased  odds  ratio.  However,  household  smoking  status  at  each  of 
the  time  periods  was  highly  correlated  and  the  results  were  not  interpretable. 

DISCUSSION 

Overall,  this  study  provides  little  evidence  that  exposure  to  ETS  in  early  life  is 
associated with an increase in the risk of breast cancer. In the few studies that have
examined early life exposure to ETS, the results have been mixed. In one study, Sandler
et al. found no increase in the risk of breast cancer in women exposed to either
maternal or paternal household smoking before participants attained 10 years of age. In
another study, Smith et al. assessed exposure to ETS up to age 16 and found women
exposed to ETS only in childhood had an OR of 1.98 (95% CI: 0.35-11.36) compared
with those never exposed. In addition, women exposed in childhood to 201-400
cigarette-years were observed to have an OR of 2.09 (95% CI: 1.05-4.16). However, the
OR was 1.51 (95% CI: 0.72-3.20) in women exposed in childhood to >400 cigarette-years.
Smith et al. concluded that there was no association between ETS exposure in
childhood and breast cancer. Lash et al. examined women who were exposed to ETS
before the age of 12 and found an OR of 4.5 (95% CI: 1.2-16). For women exposed to
ETS alone, the OR was 7.5 (95% CI: 1.6-36). However, these results were not replicated
in a more recent case-control study where exposure to ETS before the age of 13 was not
associated with an increase in the risk of breast cancer (OR = 1.1, 95% CI: 0.4-3.0).

To  our  knowledge,  this  study  is  the  first  to  specifically  examine  ETS  exposure  at 
birth,  menarche,  and  first  birth.  However,  interpretation  of  these  results  was  difficult 
because  of  the  modest  risk  estimates  and  wide  confidence  intervals.  The  estimated  ORs 
suggest  there  may  be  a  weak  association  with  exposure  to  household  smokers  in  early 
life,  particularly  around  the  time  of  birth  and  menarche. 

Tobacco  smoke  contains  potent  carcinogens  and  there  is  evidence  that  these  are 
deposited in breast tissue. Consequently, it is biologically plausible that ETS could
be a risk factor for breast cancer. Furthermore, the timing of exposure may be crucial in
defining the role of ETS and active smoking in the etiology of breast cancer. In
addition to PAHs and other carcinogenic compounds, cigarette smoke also contains
carbon monoxide, which is an inhibitor of the cytochrome P-450s. It is possible that by
inhibiting the cytochrome P-450s, carbon monoxide prevents PAHs from being
metabolized to their ultimate carcinogenic moiety. If this is occurring, then PAHs in
cigarette smoke would not contribute to carcinogenesis and the use of smoking as a
surrogate for PAH exposure would be misleading. Conversely, cigarette smoke has been
hypothesized to be anti-estrogenic and therefore may reduce the risk of breast cancer,
although our results do not support this hypothesis. It may be that genetically
heterogeneous study populations have obscured the net effect that cigarette smoke may
have on breast cancer risk.

There  are  several  limitations  of  this  study  that  should  be  considered  when 
interpreting  the  results.  Among  these  is  recall  bias.  While  such  a  bias  is  possible,  it 
would  seem  less  likely,  given  the  request  for  information  pertained  to  childhood 
experiences  and  that  there  is  no  well  known  hypothesis  linking  ETS  exposure  in  early  life 
and  breast  cancer  risk.  Misclassification  of  exposure  is  likely  given  that  ETS  was  crudely 
measured  and  did  not  take  into  account  other  sources  of  ETS.  Further  misclassification 
of ETS exposure could have occurred because some smokers may not have smoked in
the presence of the participant. In particular, we could not distinguish smokers who
restricted  their  smoking  activities  around  the  participant,  thereby  decreasing  exposure, 
from  those  who  did  not.  In  addition,  we  assumed  that  early  life  exposure  to  ETS  would 
predominantly  occur  in  the  household.  This  is  particularly  likely  for  the  time  period 
between  birth  and  menarche.  Regardless,  this  measure  is  not  quantitative  and  the 
potential  for  non-differential  misclassification  of  exposure  exists. 

In addition, the possibility of selection bias cannot be ruled out. Comparisons
between respondents and non-respondents indicated that smokers were less likely to
participate in this study. Since smokers are more likely to have parents who smoked, a
selection bias may have altered the distribution of ETS exposure in the controls from that
of the source population from which the cases arose, resulting in magnified risk estimates.

In summary, our study examined exposure to household tobacco smoke up to
21 years of age and at the time of birth, menarche, and first birth in relation to the
development  of  breast  cancer.  We  had  hypothesized  that  the  chemical  carcinogens 
present  in  tobacco  smoke  such  as  PAHs  would  affect  breast  cancer  risk  and  that  exposure 
to  tobacco  smoke  in  early  life  would  have  particular  importance.  Although  these 
estimates  may  suggest  a  weak  association  between  early  life  ETS  exposure  and  breast 
cancer,  we  cannot  exclude  the  possibility  that  ETS  exposure  is  unrelated  to  risk.  The 
recent trend toward limiting ETS exposure, particularly for children, remains appropriate,
given our knowledge of other effects of ETS on health and the relatively high
prevalence of ETS exposure in the U.S. population.

Financial Support: This work has been supported in part by CA-09051 NCI,
5R21CA8713802  NCI,  and  DAMD-170010417. 

TABLE 1. Descriptive Characteristics of Study Participants: Western New York Exposures and Breast
Cancer Study (WEB Study) (1996-2001).

                            Premenopausal               Postmenopausal
                            Cases         Controls      Cases         Controls
                            (n=325)       (n=610)       (n=841)       (n=1495)

Age
  35-45                     176 (54%)     384 (63%)     7 (1%)        29 (2%)
  46-55                     149 (46%)     224 (37%)     175 (21%)     325 (22%)
  56-65                     -             2 (1%)        325 (39%)     403 (27%)
  66-75                     -             -             262 (31%)     630 (42%)
  76+                       -             -             72 (9%)       108 (7%)
Education
  <High school              2 (1%)        -             22 (3%)       38 (3%)
  High school               105 (32%)     176 (32%)     388 (46%)     780 (52%)
  >High school              218 (67%)     434 (71%)     431 (51%)     677 (45%)
Age at Menarche
  <12                       80 (25%)      134 (22%)     199 (24%)     327 (22%)
  12-13                     212 (65%)     414 (68%)     548 (65%)     966 (65%)
  14+                       33 (10%)      62 (10%)      94 (11%)      202 (14%)
Age at Menopause
  <45                       -             -             157 (19%)     389 (26%)
  45-49                     -             -             222 (26%)     374 (25%)
  50-54                     -             -             373 (44%)     582 (39%)
  55+                       -             -             89 (11%)      150 (10%)

TABLE 1, continued. Descriptive Characteristics of Study Participants: Western New York Exposures and
Breast Cancer Study (WEB Study) (1996-2001).

                            Premenopausal               Postmenopausal
                            Cases         Controls      Cases         Controls
                            (n=325)       (n=610)       (n=841)       (n=1495)

Age at First Birth
  Never                     58 (18%)      99 (16%)      148 (18%)     155 (10%)
  13-19                     45 (14%)      38 (6%)       91 (11%)      203 (14%)
  20-21                     29 (9%)       62 (10%)      150 (18%)     259 (17%)
  22-25                     73 (22%)      157 (26%)     271 (32%)     512 (34%)
  26-39                     120 (37%)     254 (42%)     181 (22%)     366 (24%)
Parity
  0                         58 (18%)      99 (16%)      148 (18%)     155 (10%)
  1-2                       177 (54%)     323 (53%)     293 (35%)     452 (30%)
  3+                        90 (28%)      188 (31%)     400 (48%)     888 (59%)
Body Mass Index
  <25                       144 (44%)     266 (44%)     243 (29%)     448 (30%)
  25-29                     93 (29%)      168 (28%)     277 (33%)     567 (38%)
  30+                       88 (27%)      176 (29%)     321 (38%)     480 (32%)
Benign Breast Disease
  (yes)                     120 (37%)     130 (21%)     278 (33%)     327 (22%)
Relative with Breast
  Cancer (yes)              61 (21%)      56 (10%)      160 (20%)     196 (14%)
Smoking Status
  Never                     149 (46%)     326 (54%)     376 (45%)     686 (46%)
  Former                    123 (38%)     182 (30%)     365 (44%)     584 (39%)
  Current                   53 (16%)      101 (17%)     99 (12%)      224 (15%)

TABLE  2.  Risk  of  Breast  Cancer  Associated  with  Exposure  to  Environmental  Tobacco  Smoke  before  the  age  21:  Western  New  York  Exposures  and  Breast 


TABLE  3.  Risk  of  Breast  Cancer  Associated  with  Exposure  to  Environmental  Tobacco  Smoke  Exposure  at  the  Time  of  Birth  Menarche  and  First  Birth;  among 


first  birth,  family  history  of  breast  cancer,  total  alcohol  consumption,  and  age  at  menopause  for  postmenopausal  women  only. 


APPENDIX  V 


Positional Accuracy of Geocoding in Epidemiologic Research. Matthew R. Bonner,
Daikwon Han, Jing Nie, Jo L. Freudenheim, Peter Rogerson, John E. Vena.
Epidemiology (accepted for publication July 2003)

Geographic  Information  Systems  (GIS)  offer  powerful  techniques  for  epidemiologists. 
Geocoding  is  an  important  step  in  the  use  of  GIS  in  epidemiologic  research  and  the 
validity  of  any  epidemiologic  study  using  this  methodology  depends,  in  part,  on  the 
positional  accuracy  of  the  geocoding  process.  We  conducted  a  study  comparing  the 
validity  of  positions  geocoded  with  a  commercially  available  program  to  positions 
determined  by  receivers  for  the  Global  Positioning  System  (GPS)  satellites. 

Methods: 

Addresses  (n=200)  were  randomly  selected  from  a  recently  completed  case-control 
study in Western New York. These addresses were geocoded using ArcView 3.2 on the
GDT  Dynamap/2000  U.S.  Street  database.  Latitude  and  longitude  of  these  same 
addresses  were  measured  with  a  GPS  receiver,  and  distance  between  these  two  points  was 
calculated  for  all  addresses. 
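
The distance between a geocoded point and its GPS-measured counterpart can be computed from the two latitude/longitude pairs. The abstract does not state which distance formula was used, so the sketch below assumes the haversine great-circle formula on a spherical Earth; the function name and the radius constant are illustrative choices, not details from the study.

```python
import math

# Mean Earth radius in meters; a spherical approximation (assumption).
EARTH_RADIUS_M = 6_371_000


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in meters between two points in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))
```

At the scale of tens to hundreds of meters reported in the results, the spherical approximation error is small relative to the positional differences being measured.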

Results: 

The distance between the geocoded point and the GPS point was within 100m for the
majority of subject addresses (79%), with only a small proportion (3%) having a
distance greater than 800m. The overall median distance between GPS points and
geocoded points was 38m (90% CI 33.67-45.90). Distances were not different for cases
and controls. Urban addresses (32m; 90% CI 28.32-36.81) were slightly more accurate
compared to the non-urban addresses (52m; 90% CI 43.51-61.06).

Conclusions:

Overall,  this  study  indicates  that  the  suitability  of  geocoding  for  epidemiologic 
research  depends  on  the  level  of  spatial  resolution  required  to  assess  exposure.  While 
sources  of  error  in  positional  accuracy  for  geocoded  addresses  exist,  geocoding  of 
addresses  is  largely  very  accurate. 


Household  smoke  exposure  in  early  life  and  breast  cancer  in  Western  New  York. 
Bonner MR, Nie J, Han D, Vito D, Vena JE, Rogerson P, Muti P, Trevisan M,
Freudenheim  JL.  American  Association  for  Cancer  Research,  Toronto,  Canada, 
April,  2003. 

Exposure  to  tobacco  smoke  in  early  life  may  be  more  relevant  for  breast  cancer  than 
exposure  in  adult  life.  Numerous  epidemiologic  studies  of  adult  smoking  exposure  have 
been  equivocal.  Relatively  few  investigations,  however,  have  examined  tobacco  smoke 
exposure  in  early  life  when  breast  epithelium  may  be  more  sensitive  to  carcinogens.  In 
this  study,  we  hypothesized  that  household  tobacco  smoke  exposure  during  critical  time 
periods  of  breast  development  (birth,  menarche,  and  first  birth)  may  be  associated  with 
the  occurrence  of  breast  cancer.  As  part  of  the  Center  for  Preventive  Medicine,  we  used  a 
case-control  study  design  with  1,170  cases  of  primary,  histologically  confirmed,  incident 
breast cancer and 2,116 population-based controls. Exposure to household smokers at
birth,  at  menarche  and  at  first  birth  was  assessed  with  a  self-administered  residential 
history  questionnaire.  Each  subject  indicated  all  previous  residences  as  well  as  the 
number  of  other  people  residing  at  that  address  and  the  number  of  those  household 
residents  who  smoked.  We  categorized  the  number  of  household  smokers  into  none,  one, 
and  two  or  more  household  smokers.  Logistic  regression  was  used  to  estimate  odds 
ratios (OR) and 95% confidence intervals (95% CI) with no household smokers as the
referent category. Multivariate logistic models were adjusted for age at interview, years
of education, previous benign breast disease, age at menarche, parity, body mass index,
total lifetime alcohol consumption, and relative with breast cancer. We found that
women who at birth resided with 1 or more household smokers were more likely to
develop breast cancer compared to those residing with no household smokers (adjusted
OR = 1.36, 95% CI = 1.08-1.70). A similar association was also observed for women
who at menarche were exposed to household smokers (adjusted OR = 1.43, 95% CI
= 1.15-1.77). Exposure to household smokers at the time of first birth was more weakly
associated with breast cancer (adjusted OR = 1.19, 95% CI = 0.96-1.47). In a logistic
model simultaneously adjusting for household smoke exposure at all three time periods,
only exposure to household smokers at the time of menarche remained above unity
(adjusted OR = 1.77, 95% CI = 1.00-3.15). However, exposure to household smoke in
these time periods tended to be correlated. These results suggest that household smoke
exposure  in  early  life  may  be  associated  with  an  increase  in  the  likelihood  of  breast 
cancer  and  it  may  be  that  exposure  at  the  time  of  menarche  is  more  important  than 
exposure  at  other  time  periods. 


Clustering  of  Lifetime  Residence  and  Breast  Cancer  Risk  in  Western  New  York 
*D  Han,  MR  Bonner,  J  Nie,  PA  Rogerson,  JE  Vena,  P  Muti,  M  Trevisan,  JL 
Freudenheim.  University  at  Buffalo,  Buffalo,  NY  14214. 

In  order  to  investigate  the  role  of  environmental  exposures  on  breast  cancer,  we 
examined  breast  cancer  risk  associated  with  lifetime  residential  history  using  GIS-based 
exploratory  spatial  analyses.  Data  on  residential  history  and  risk  factors  were  collected  as 
part of a population-based case control study of incident, primary, histologically
confirmed breast cancer in western New York. Controls were frequency matched to cases
on age and county of residence. Relative risk surfaces of cases and controls were
identified  to  depict  elevated  areas  of  breast  cancer  risk  using  kernel  smoothing  methods. 
The  ratio  of  cases  to  controls  was  first  obtained  based  on  location  of  their  residence  for 
each  participant  at  the  time  of  birth,  menarche,  first  birth,  and  10  and  20  years  before 
interview,  then  adjusted  for  established  breast  cancer  risk  factors  using  a  generalized 
additive  model.  Cumulative  risk  surfaces  were  constructed  by  using  case-control 
densities  from  each  temporal  group.  These  surfaces  were  compared  between  residences 
for  pre-menopausal  and  post-menopausal  women.  We  found  a  general  tendency  of  spatial 
clustering  of  lifetime  residence,  and  we  observed  strong  evidence  of  clustering  of  lifetime 
residence  for  pre-menopausal  women  relative  to  that  for  post-menopausal  women.  We 
were able not only to pinpoint geographic areas with higher cumulative densities but also to
identify the role of early exposures through exploratory spatial analyses. Our findings
suggest that there may be identifiable etiological processes linking exposure and breast cancer
risk,  especially  for  pre-menopausal  women,  and  that  early  exposures  may  be  of  particular 
importance. 
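
The kernel smoothing step can be sketched as a ratio of case density to control density at a map location. This is a minimal sketch assuming a Gaussian kernel, a single fixed bandwidth, and planar coordinates; the abstract does not state the kernel or bandwidth used, the generalized additive model adjustment is not shown, and all names and coordinates are hypothetical.

```python
import math

def kernel_density(x, y, points, bandwidth):
    """Gaussian kernel density estimate at (x, y) from a list of (px, py) points."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        squared_distance = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
    return norm * total

def relative_risk_at(x, y, cases, controls, bandwidth):
    """Ratio of case density to control density at one location."""
    return (kernel_density(x, y, cases, bandwidth)
            / kernel_density(x, y, controls, bandwidth))

# Hypothetical residence coordinates (arbitrary planar units, not study data).
cases = [(0.0, 0.0), (0.5, 0.0), (0.0, 0.5), (5.0, 5.0)]
controls = [(0.0, 0.0), (5.0, 5.0), (5.0, 4.5), (4.5, 5.0)]
ratio_near_case_cluster = relative_risk_at(0.0, 0.0, cases, controls, bandwidth=1.0)
```

Evaluating the ratio over a grid of locations gives a surface; values above 1 mark areas where cases are relatively more concentrated than controls.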


Total  Suspended  Particulate  Exposure  in  Early  Life  and  Breast  Cancer. *MR 
Bonner,  D  Han,  J  Nie,  JE  Vena,  P  Rogerson,  P  Muti,  M  Trevisan,  D  Vito,  JL 
Freudenheim.  University  at  Buffalo,  Buffalo,  NY  14214. 

Polycyclic  aromatic  hydrocarbons  (PAHs)  are  ubiquitous  environmental 
pollutants  present  in  air  pollution  and  largely  associated  with  particulate  matter.  PAHs 
may  be  estrogenic  and  could  contribute  to  breast  cancer  etiology.  Further,  early  life 
exposures  may  be  significant  in  the  development  of  this  disease.  We  examined  total 
suspended particulate (TSP) exposure (as a proxy for PAH exposure) in early life in
relation to the risk of breast cancer. We conducted a population-based case-control study
with 1,170 cases of primary, histologically confirmed, incident breast cancer and 2,116
randomly selected controls. TSP concentrations measured by air monitoring samplers
from 1958-1991 in Erie and Niagara counties were used to estimate TSP exposure.
Average  TSP  concentrations  were  computed  for  each  decade  and  inverse  distance 
squared  weighting  interpolation  was  used  to  estimate  TSP  concentrations  for  each 
subject’s  residence  at  birth,  menarche,  and  first  birth.  Logistic  regression  was  used  to 
estimate odds ratios (OR) and 95% confidence intervals (95% CI), adjusting for potential
confounders. No association with risk was observed in premenopausal women for exposure
to TSP. In postmenopausal women, the continuous adjusted OR was 1.21 (95% CI 1.05-1.40)
for every 30 µg/m³ increase in exposure to TSP at the birth residence. In this
group, the OR for exposure to more than 135 µg/m³ of TSP at the time of birth, compared
with less than 81 µg/m³, was 2.59 (95% CI 0.96-7.03). These results suggest that high
levels of exposure in early life to TSP may be
associated  with  an  increase  in  the  risk  of  postmenopausal  breast  cancer. 
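
The inverse distance squared weighting step can be sketched as follows. This is a minimal sketch assuming planar coordinates and that every monitor contributes to the estimate; the abstract does not say whether a search radius or a maximum number of monitors was applied. The function name, monitor locations, and readings are hypothetical.

```python
def idw_squared(x, y, monitors):
    """Inverse-distance-squared weighted mean of monitor readings at (x, y).

    monitors is a list of (mx, my, concentration) tuples.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for mx, my, concentration in monitors:
        squared_distance = (x - mx) ** 2 + (y - my) ** 2
        if squared_distance == 0.0:
            # The location coincides with a monitor: use its reading directly.
            return concentration
        weight = 1.0 / squared_distance
        weighted_sum += weight * concentration
        weight_total += weight
    return weighted_sum / weight_total

# Hypothetical monitors: (x, y, decade-average TSP). Not study data.
monitors = [(0.0, 0.0, 100.0), (2.0, 0.0, 50.0)]
estimate_midway = idw_squared(1.0, 0.0, monitors)
```

A residence midway between two monitors receives the simple mean of their readings, and the estimate moves toward the nearer monitor's value as the residence approaches it.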


Daikwon Han, “Geographical Epidemiology of Breast Cancer in Western New
York: Migration and Disease Clustering,” Annual Meeting of the Association of
American Geographers, Los Angeles, CA. 2002.

Migration  has  a  significant  effect  on  geographic  variations  of  disease  and  health 
outcomes.  The  complex  process  of  human  movement  is  one  of  the  complicating  factors  in 
explaining  the  causal  relationships  between  disease  and  environment,  but  also  an 
important  determinant  of  human  health  due  to  the  exposure  to  disease  through  movement. 
This study explores the effects of migration on disease clustering to assess: 1) the
importance of residential locations to the risk of breast cancer, and 2) the statistical
significance of clustering with migration effects. To identify the reasons for geographic
variations  of  disease,  the  study  presents  hypotheses  associated  with  migration  and  disease 
risks.  Exploratory  analyses  in  a  GIS  environment  are  used  to  detect  the  spatial-temporal 
patterns of residential locations and clustering of cases and controls in Western New York.
The overall effects of migration on disease clustering are identified by comparing the
lifetime residential history of cases and controls, after controlling for known risk factors
such as age and history of breast cancer. Investigating the role of migration in
disease clustering processes helps explain the consequences of in- and out-movement
of people diagnosed with disease for the risk of disease as well as for the
spatial variations of disease. Once significant clusters are identified, further work is
required  to  investigate  the  relationships  between  residential  changes  and  environmental 
exposures  in  explaining  unknown  etiology  of  breast  cancer. 


Residential  Proximity  at  Birth  to  Industrial  Sites  and  Subsequent 
Risk  of  Breast  Cancer,  MR  Bonner,  D  Han,  J  Nie,  JL  Freudenheim, 
JE  Vena,  American  College  of  Epidemiology  Annual  Meeting, 
September,  2002,  Albuquerque,  NM. 

Purpose:  To  investigate  the  relationship  between  residential  proximity  at 
birth  to  industrial  sites  contracted  by  the  Atomic  Energy  Commission 
(AEC)  to  process  radioactive  material  and  the  subsequent  development  of 
breast cancer (BC) in pre- and postmenopausal women.

Methods:  We  used  a  completed  case-control  study  (n=3,335)  and 
restricted subjects to lifetime residents of Western New York (n=1,181).
Subjects were further restricted to those born in 1940 and later because the
first  industrial  sites  began  operating  under  the  AEC  contract  in  1940.  A 
total  of  266  primary  incident  breast  cancer  cases  and  411  controls 
frequency  matched  by  age  were  included  in  this  analysis.  Exposure  was 
assessed  as  distance  (in  miles)  of  residence  at  birth  to  the  13  industrial 
sites.  The  closest  site  was  then  selected  for  each  subject  as  a  surrogate  for 
environmental  exposure.  The  distance  to  the  closest  site  was  categorized 
into  quartiles  based  on  the  distribution  in  the  controls.  Odds  ratios  (OR) 
and 95% confidence intervals (95% CI) were used to estimate the
association  between  residential  proximity  and  subsequent  BC.  The  ORs 
were  adjusted  for  age,  education,  age  at  menarche,  parity,  and  age  at  first 
birth. 

Results: We observed an adjusted OR of 3.8 (95% CI 1.9-7.7) for
premenopausal women residing less than 2.45 miles from the closest site
when compared to women residing more than 8 miles from the closest
industrial site. No such associations were observed in postmenopausal
women. 

Conclusion:  These  preliminary  findings  suggest  that  relatively  close 
residential  proximity  to  industrial  sites  involved  in  uranium  processing 
may  increase  the  risk  of  premenopausal  BC.  However,  it  is  unclear 
whether  this  association  can  be  attributed  to  the  environmental 
contamination  with  radioactive  material,  or  some  other  environmental 
contaminant also produced at these industrial sites.
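
The exposure assignment described in the Methods above (distance to the closest site, categorized into quartiles of the control distribution) can be sketched as below. This is a minimal sketch assuming planar coordinates in miles and the default quantile method of Python's statistics module; the abstract does not state how distances or cut points were computed. All names and values are hypothetical.

```python
import bisect
import math
import statistics

def nearest_site_distance(residence, sites):
    """Straight-line distance from a residence to the closest site."""
    return min(math.dist(residence, site) for site in sites)

def quartile_category(distance, control_distances):
    """Return 1-4, where 1 is the quartile of controls living closest to a site.

    Cut points come from the controls only, so cases are classified
    against the control distribution.
    """
    cut_points = statistics.quantiles(control_distances, n=4)
    return bisect.bisect_right(cut_points, distance) + 1

# Hypothetical site coordinates and control distances (miles). Not study data.
sites = [(3.0, 4.0), (6.0, 8.0)]
control_distances = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
distance_at_birth = nearest_site_distance((0.0, 0.0), sites)
category = quartile_category(distance_at_birth, control_distances)
```

Deriving the cut points from controls means each category holds roughly a quarter of the controls, which keeps the referent group well populated.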


Daikwon Han, Jing Nie, Matthew Bonner, Dominica Vito, Jo Freudenheim,
“Environmental Exposures Associated with Lifetime Residential History: A GIS-based
Clustering Analysis of Breast Cancer,” Annual Meeting of the Society for Epidemiologic
Research, Palm Desert, CA. 2002.

There  is  increasing  evidence  that  early  exposures  may  be  related  to  risk  of  breast 
cancer. We were interested in whether there was clustering of breast cancer cases based on
their residence in early life, and we identified spatio-temporal clustering of cases and controls
at critical time periods: residential locations at birth, at menarche, and at each woman’s first
birth. Data used here were part of the Center for Preventive Medicine case control study
of  incident,  pathologically  confirmed  breast  cancer  (1996-2001)  in  Erie  and  Niagara 
counties.  Controls  were  frequency  matched  on  age  and  county  of  residence;  controls  less 
than  65  were  randomly  selected  from  the  New  York  State  Department  of  Motor  Vehicles 
list  and  those  greater  than  65  from  the  Health  Care  Finance  Administration  list.  All  cases 
and  controls  provided  lifetime  residential  histories.  The  spatial  k-function  method  was 
used  to  calculate  the  distance  between  each  residence  within  a  certain  search  radius  and 
to  compare  observed  with  expected  patterns  over  pre-specified  distances.  We  found  a 
general tendency of spatial clustering for cases for these time periods, especially at small
geographic  scales,  compared  with  the  simulated  theoretical  distribution  of  expected  patterns. 
The  evidence  for  clustered  residence  at  birth  and  at  menarche  was  stronger  than  that  for 
first  birth.  This  study  provides  additional  evidence  that  early  environmental  exposures 
may  be  related  to  breast  cancer  risk. 
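
The spatial K-function can be sketched as a count of residence pairs lying within a search radius, scaled by the study area. This is a minimal sketch without edge correction, and it omits the simulation step used to obtain the expected pattern; under complete spatial randomness the expected value is approximately pi times the radius squared. The function name, coordinates, and area are hypothetical.

```python
import math

def ripley_k(points, radius, area):
    """Naive Ripley's K estimate at one radius, with no edge correction.

    Counts ordered pairs of distinct points no farther apart than radius
    and scales by area / (n * (n - 1)).
    """
    n = len(points)
    pair_count = 0
    for i in range(n):
        for j in range(n):
            if i != j and math.dist(points[i], points[j]) <= radius:
                pair_count += 1
    return area * pair_count / (n * (n - 1))

# Hypothetical residence coordinates in a 10 x 10 study area. Not study data.
residences = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0)]
observed_k = ripley_k(residences, radius=1.5, area=100.0)
expected_k_random = math.pi * 1.5 ** 2
```

An observed value well above the expected value at a given radius is read as clustering at that spatial scale.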


"Exploratory  Spatial  Analyses  of  Lifetime  Breast  Cancer  Risk  and  Residence 
History"  Daikwon  Han,  Jo  L.  Freudenheim,  Peter  A.  Rogerson,  Matthew  R. 
Bonner,  Jing  Nie.  Annual  Meeting  of  the  Association  of  American  Geographers, 
New  Orleans,  LA.  2003. 

This  research  investigates  lifetime  breast  cancer  risk  associated  with  residential 
history  based  upon  epidemiologic  methods  and  exploratory  spatial  analyses.  Data  were 
drawn  from  a  case  control  study  of  breast  cancer  in  western  New  York  and  provided 
information  on  lifetime  residential  history  and  risk  factors  for  1170  breast  cancer  cases 
and  2116  controls.  Epidemiologic  methods  were  utilized  to  identify  relationships  between 
breast cancer risk and residence history. The ratio of cases to controls was obtained
based on residential location and these ratios were adjusted for established risk factors,
including age, education, and history of benign breast disease. Density surfaces of cases
and controls were created to identify elevated areas of breast cancer risk using kernel
smoothing methods, and these were repeated for six temporal groups: residences at birth,
at menarche, at women's first birth, 20 years prior to diagnosis, 10 years prior to
diagnosis, and current addresses. Lifetime risk surfaces were constructed and visualized
by  using  case-control  densities  from  each  temporal  group.  These  surfaces  were  further 
analyzed  using  weights  dependent  upon  length  of  residence. 


“Residential Proximity to Chemical or Primary Metal Industry and the Risk of
Breast Cancer in Western New York” *J. Nie, Bonner M, Han D, LaFalce J, Vena
JE, Freudenheim JL. Presented at the Annual Meeting of the Society for
Epidemiologic Research, Atlanta, GA, June 2003.

Women  living  in  urban  environments  are  at  greater  risk  of  breast  cancer  than 
those  in  rural  settings;  this  difference  is  not  well  understood.  In  this  study,  we  examined 
residential  proximity  to  chemical  or  primary  metal  industry  in  relation  to  breast  cancer 
risk. Women aged 35-79 with incident, primary, histologically confirmed breast cancer
living in Erie or Niagara counties were invited to participate; controls were
population based, frequency matched to cases on age and race. Self-reported lifetime
residential histories were collected. A total of 863 cases and 1579 controls with complete
residential addresses for the periods 10 and 20 years prior to interview were included in
these analyses. Industrial directories for New York State for 1978 and 1988 were used to
identify chemical and primary metal factories operating in this region. Chemical facilities in
our study included Standard Industrial Classification (SIC) groups 28 (Chemicals and
allied products), 29 (Petroleum refining and related industries), and 30 (Rubber and
miscellaneous plastics products); primary metal facilities were those in SIC group 33.
Quartiles were created to categorize the distance from residential address to the closest
industrial site; women living within 0.25 mile of a facility were put in a separate category.
We used logistic regression to calculate the odds ratios and 95% confidence intervals,
adjusting for potential confounding factors. For both time periods and for both pre- and
postmenopausal women, there was no evidence that living close to a chemical or primary
metal facility 10 or 20 years earlier was associated with increased breast cancer risk.


